On 10/30/2013 09:05 AM, Dinu Vlad wrote:
Hello,
I've been doing some tests on a newly installed ceph cluster:
# ceph osd pool create bench1 2048 2048
# ceph osd pool create bench2 2048 2048
# rbd -p bench1 create test
# rbd -p bench1 bench-write test --io-pattern rand
elapsed: 483 ops: 396579 ops/sec: 820.23 bytes/sec: 2220781.36
# rados -p bench2 bench 300 write --show-time
# (run 1)
Total writes made: 20665
Write size: 4194304
Bandwidth (MB/sec): 274.923
Stddev Bandwidth: 96.3316
Max bandwidth (MB/sec): 748
Min bandwidth (MB/sec): 0
Average Latency: 0.23273
Stddev Latency: 0.262043
Max latency: 1.69475
Min latency: 0.057293
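As a quick sanity check, the rados bench numbers above are internally
consistent: with the default of 16 concurrent operations, Little's law gives
bandwidth ≈ concurrency × object size / average latency (a back-of-the-envelope
check, assuming the default -t 16 and the 4 MB write size shown above):

```python
# Little's law sanity check on the rados bench run above:
# throughput ~= concurrent_ops * write_size / average_latency
concurrent_ops = 16                        # rados bench default (-t 16)
write_size_mb = 4194304 / (1024 * 1024)    # 4 MB objects, from the output
avg_latency_s = 0.23273                    # average latency, from the output

bandwidth_mb_s = concurrent_ops * write_size_mb / avg_latency_s
print(round(bandwidth_mb_s, 1))            # ~275, matching the reported 274.923 MB/s
```

So the run was latency-bound at the default queue depth, which is why a higher
concurrency (or several bench processes) is usually needed to saturate the cluster.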
These results seem to be quite poor for the configuration:
MON: dual-cpu Xeon E5-2407 2.2 GHz, 48 GB RAM, 2xSSD for OS
OSD: dual-cpu Xeon E5-2620 2.0 GHz, 64 GB RAM, 2xSSD for OS (on-board
controller), 18 HDD 1TB 7.2K rpm SAS for OSD drives and 6 SSDs (SATA) for
journal, attached to a LSI 9207-8i controller.
All servers have dual 10GE network cards, connected to a pair of dedicated
switches. Each SSD has three 10 GB partitions for journals.
Agreed, you should see much higher throughput with that kind of storage
setup. What brand/model SSDs are these? Also, what brand and model of
chassis? With 24 drives and 8 SSDs I could push 2GB/s (no replication,
though) with a couple of concurrent rados bench processes running against
our SC847A chassis, so ~550MB/s aggregate throughput for 18 drives and 6
SSDs is definitely on the low side.
I'm actually not too familiar with what the RBD benchmarking commands
are doing behind the scenes. Typically I've tested fio on top of a
filesystem on RBD.
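For reference, the fio-on-a-filesystem-on-RBD approach can be sketched as a
job file along these lines (a sketch only -- the pool/image names come from
the commands above, but the device name and mount point are assumptions for
your system):

; rbd-fs.fio -- 4M sequential writes through a filesystem on RBD
; prerequisites (device name may differ on your system):
;   rbd -p bench1 map test
;   mkfs.xfs /dev/rbd0
;   mount /dev/rbd0 /mnt/rbdtest
[rbd-fs-write]
directory=/mnt/rbdtest
rw=write
bs=4M
size=10G
direct=1
ioengine=libaio
iodepth=16
runtime=60
time_based=1

Run it with "fio rbd-fs.fio"; that exercises the full RBD write path the way
a real client filesystem would.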
Using Ubuntu 13.04, ceph 0.67.4, XFS for backend storage. The cluster was
installed using ceph-deploy. ceph.conf is pretty much out of the box (diff
from default follows):
[global]
public network = 10.4.0.0/24
cluster network = 10.254.254.0/24
[osd]
osd_journal_size = 10240
osd mount options xfs = "rw,noatime,nobarrier,inode64"
osd mkfs options xfs = "-f -i size=2048"
All tests were run from a server outside the cluster, connected to the storage
network with 2x 10GE NICs.
I've done a few other tests of the individual components:
- network: avg. 7.6 Gbit/s (iperf, mtu=1500), 9.6 Gbit/s (mtu=9000)
- md raid0 write across all 18 HDDs - 1.4 GB/s sustained throughput
- fio SSD write (xfs, 4k blocks, directio): ~ 250 MB/s, ~55K IOPS
What you might want to try is 4M direct IO writes using libaio and a high
iodepth to all drives (spinning disks and SSDs) concurrently, and see what
the per-drive and aggregate throughput look like.
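That concurrent direct-IO test can be sketched with a fio job file like the
one below (device names are placeholders -- substitute your 18 HDDs and 6
SSDs; note this writes to raw devices and destroys any data on them):

; all-drives.fio -- 4M direct libaio writes to every drive at once
; WARNING: writes directly to raw devices, destroying their contents
[global]
rw=write
bs=4M
direct=1
ioengine=libaio
iodepth=16
runtime=60
time_based=1

; one job stanza per drive, so fio reports per-drive numbers
; as well as the aggregate at the end
[hdd-sdb]
filename=/dev/sdb
[hdd-sdc]
filename=/dev/sdc
; ...repeat for the remaining HDDs and the SSDs...

If the aggregate here is well above what rados bench achieves, the bottleneck
is in the Ceph/network layer rather than the drives themselves.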
With just SSDs, I've been able to push the 9207-8i up to around 3GB/s
with Ceph writes (1.5GB/s if you don't count journal writes), but
perhaps there is something interesting about the way the hardware is
setup on your system.
I'd appreciate any suggestion that might help improve the performance or
identify a bottleneck.
Thanks
Dinu
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com