Big picture, I'm working on getting an openstack deployment going using
ceph-backed volumes, but I'm running into really poor disk performance, so
I'm in the process of simplifying things to isolate exactly where the
problem lies.

The machines I'm using are HP ProLiant DL160 G6 machines with 72GB of RAM.
All the hardware virtualization features are turned on. The host OS is
Ubuntu 14.04, using the deadline I/O scheduler. I've run a variety of
benchmarks to make sure the disks are working right, and they seem to be:
everything indicates bare metal write speeds to a single disk in the
~100MB/s ballpark, with some tests reporting as high as 120MB/s.
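
For concreteness, the sort of sequential write check I mean by "bare metal
write speed" is something like the following, run on the host (the output
path is just a placeholder; oflag=direct bypasses the page cache so the
number reflects the disk itself):

time dd if=/dev/zero of=/mnt/data/deleteme.bin bs=20M count=256 oflag=direct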

To try to isolate the problem I've done some testing with a very simple [1]
qemu invocation on one of the host machines. Inside that VM I get about
50MB/s write throughput; I've tested with both qemu 2.0 and 1.7 and gotten
similar results. For quick testing I'm using a simple dd command [2] to get
a sense of where things lie, and it has consistently produced results close
to what more intensive synthetic benchmarks (iozone and dbench) produce. My
understanding is that I should be expecting closer to 80% of bare metal
performance, so this seems like the first place to focus in figuring out
why things aren't going well.
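
For anyone who wants to reproduce, a direct-I/O sequential write inside the
guest could be something along these lines (file name, size, and queue
depth are arbitrary):

fio --name=seqwrite --filename=/root/fio.test --rw=write --bs=1M --size=2G \
  --direct=1 --ioengine=libaio --iodepth=4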

When running on a ceph-backed volume I get closer to 15MB/s with the same
tests, and as much as 50% iowait. Typical operations that take seconds on
bare metal take tens of seconds, or even minutes, in the VM. That actually
drove me to look at things with strace, and I'm seeing streams of fsync()
calls and pselect6() timeouts while the processes are running. More direct
tests of ceph performance are able to saturate the NIC, pushing about
90MB/s. I have ganglia installed on the host machines, and when I run tests
from within a VM, the network throughput appears to be artificially capped:
rather than the "spiky" graph the direct ceph tests produce, I get a
perfectly flat horizontal line at 10 or 20MB/s.
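
If anyone with a similar setup wants to compare numbers, a host-side write
test roughly like this should be an easy common reference point (pool name
is a placeholder):

rados bench -p rbd 60 write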

Any and all suggestions would be appreciated, especially if someone has a
similar deployment that I could compare notes with.

QH

1 - My testing qemu invocation:

qemu-system-x86_64 -cpu host -m 2G -display vnc=0.0.0.0:1 -enable-kvm \
  -vga std -rtc base=utc \
  -drive if=none,id=blk0,cache=none,aio=native,file=/root/cirros.raw \
  -device virtio-blk-pci,drive=blk0,id=blk0
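
For the ceph-backed case, I believe the drive would be specified along
these lines instead (pool/image names are placeholders, and this assumes
qemu is built with rbd support):

-drive if=none,id=blk0,cache=none,format=raw,file=rbd:volumes/test-volume \
-device virtio-blk-pci,drive=blk0,id=blk0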

2 - simple dd performance test: time dd if=/dev/zero of=deleteme.bin bs=20M
count=256
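
(A variant that forces the data out to disk before timing stops, so cached
writes don't inflate the number, would be something like: time dd
if=/dev/zero of=deleteme.bin bs=20M count=256 conv=fdatasync)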
