Hello all, let me briefly share my own experience with virtual disk performance. In short: I stopped using image files of any kind and switched to LVM partitions. The reason is that your virtual guests will choke if you copy files to the physical host, whether over the network or on the host itself. I tried everything to prevent that, but in the end I gave up. My host had 256 GB of RAM and 32 Xeon cores, and it was still impossible to copy a 10 GB file to the host without freezing the guests. I find this pretty ridiculous, but I do not blame qemu; the cause seems to be in the kernel's VFS layer. If you want working, performant qemu guests, back their virtual disks with block devices rather than image files.
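A minimal sketch of what a block-device-backed guest disk looks like; the volume group and LV names here (vg0, vm1-disk) are examples, not taken from any setup in this thread:

```shell
# Carve a logical volume out of an existing volume group (names are examples)
lvcreate -L 20G -n vm1-disk vg0

# Hand qemu the block device directly; cache=none plus aio=native avoids
# double-buffering the guest's writes in the host page cache
qemu-system-x86_64 -enable-kvm -cpu host -m 2G \
  -drive if=none,id=blk0,cache=none,aio=native,format=raw,file=/dev/vg0/vm1-disk \
  -device virtio-blk-pci,drive=blk0
```

With format=raw on a plain block device there is no image-format metadata in the I/O path at all, which is the point of the switch.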
Regards, Stephan

On Fri, 30 May 2014 13:13:13 +1000 Blair Bethwaite <blair.bethwa...@gmail.com> wrote:
> Quentin,
>
> I doubt you'll get much useful help until you do a reasonable benchmark.
> The dd you mention in the original post is just testing your guest's page
> cache, i.e. it is not direct I/O. I'd suggest grabbing a newish fio and using
> the genfio tool to help quickly set up some more varied benchmarks.
>
> Regarding your Ceph config, you might like to elaborate on the cluster
> setup and provide some detail as to the librbd options you're giving to
> Qemu, plus what Ceph baseline you are comparing against.
>
>
> On 30 May 2014 02:38, Quentin Hartman <qhart...@gmail.com> wrote:
>
> > I don't know what I changed, but this morning instances running on local
> > storage are returning essentially bare-metal performance. The only
> > difference I see versus my test scenario is that they are actually using
> > qcow images instead of RAW. So, hurray! Assuming it stays performant, the
> > local storage problem is fixed. Now I just need to get ceph-backed
> > instances working right.
> >
> > Would this be the appropriate venue for that discussion, or would a ceph
> > list be a better venue? I believe the problem lies with qemu's interaction
> > with librbd, since my direct tests of the ceph cluster indicate good
> > performance.
> >
> >
> > On Thu, May 29, 2014 at 10:14 AM, Quentin Hartman <qhart...@gmail.com>
> > wrote:
> >
> >> I found this page: http://www.linux-kvm.org/page/Tuning_Kernel and all
> >> of the recommended kernel options are enabled or built as modules which
> >> are loaded.
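As a concrete starting point for the fio suggestion above, a minimal sketch of a sequential-write job that bypasses the page cache; the file path and size are placeholders, not figures from this thread:

```shell
# Hypothetical fio job: direct-I/O sequential writes, roughly comparable to
# the dd test but with the guest page cache out of the picture
fio --name=seqwrite --rw=write --bs=1M --size=2g \
    --direct=1 --ioengine=libaio --numjobs=1 \
    --end_fsync=1 --filename=/tmp/fiotest.bin
```

Varying --bs, --rw (randwrite, randrw) and --iodepth from there gives the "more varied benchmarks" Blair is asking for; genfio can generate a whole matrix of such job files.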
> >>
> >>
> >> On Thu, May 29, 2014 at 10:07 AM, Quentin Hartman <qhart...@gmail.com>
> >> wrote:
> >>
> >>> It looks like that particular feature is already enabled:
> >>>
> >>> root@node13:~# dmesg | grep -e DMAR -e IOMMU
> >>> [ 0.000000] ACPI: DMAR 00000000bf77e0c0 000100 (v01 AMI OEMDMAR
> >>> 00000001 MSFT 00000097)
> >>> [ 0.105190] dmar: IOMMU 0: reg_base_addr fbffe000 ver 1:0 cap
> >>> c90780106f0462 ecap f020f6
> >>>
> >>>
> >>> On Thu, May 29, 2014 at 10:04 AM, Quentin Hartman <qhart...@gmail.com>
> >>> wrote:
> >>>
> >>>> I do not. I did not know those were a thing. My next steps were to
> >>>> experiment with different BIOS settings and kernel parameters, so this
> >>>> is a very timely suggestion. Thanks for the reply. I would love to hear
> >>>> other suggestions for kernel parameters that may be relevant.
> >>>>
> >>>> QH
> >>>>
> >>>> On Thu, May 29, 2014 at 9:58 AM, laurence.schuler <
> >>>> laurence.schu...@nasa.gov> wrote:
> >>>>
> >>>>> On 05/28/2014 07:56 PM, Quentin Hartman wrote:
> >>>>>
> >>>>> Big picture, I'm working on getting an openstack deployment going
> >>>>> using ceph-backed volumes, but I'm running into really poor disk
> >>>>> performance, so I'm in the process of simplifying things to isolate
> >>>>> exactly where the problem lies.
> >>>>>
> >>>>> The machines I'm using are HP Proliant DL160 G6 machines with 72GB
> >>>>> of RAM. All the hardware virtualization features are turned on. Host
> >>>>> OS is Ubuntu 14.04, using the deadline IO scheduler. I've run a
> >>>>> variety of benchmarks to make sure the disks are working right, and
> >>>>> they seem to be. Everything indicates bare metal write speeds to a
> >>>>> single disk in the ~100MB/s ballpark. Some tests report as high as
> >>>>> 120MB/s.
> >>>>>
> >>>>> To try to isolate the problem I've done some testing with a very
> >>>>> simple [1] qemu invocation on one of the host machines.
> >>>>> Inside that VM, I get about 50MB/s write throughput. I've tested with
> >>>>> both qemu 2.0 and 1.7 and gotten similar results. For quick testing
> >>>>> I'm using a simple dd command [2] to get a sense of where things lie.
> >>>>> This has consistently produced results near what more intensive
> >>>>> synthetic benchmarks (iozone and dbench) produced. I understand that I
> >>>>> should be expecting closer to 80% of bare metal performance. It seems
> >>>>> that this would be the first place to focus, to understand why things
> >>>>> aren't going well.
> >>>>>
> >>>>> When running on a ceph-backed volume, I get closer to 15MB/s using
> >>>>> the same tests, and have as much as 50% iowait. Typical operations
> >>>>> that take seconds on bare metal take tens of seconds, or minutes, in a
> >>>>> VM. This problem actually drove me to look at things with strace, and
> >>>>> I'm finding streams of FSYNC and PSELECT6 timeouts while the processes
> >>>>> are running. More direct tests of ceph performance are able to
> >>>>> saturate the NIC, pushing about 90MB/s. I have ganglia installed on
> >>>>> the host machines, and when I am running tests from within a VM, the
> >>>>> network throughput seems to be getting artificially capped. Rather
> >>>>> than the more "spiky" graph produced by the direct ceph tests, I get a
> >>>>> perfectly flat horizontal line at 10 or 20MB/s.
> >>>>>
> >>>>> Any and all suggestions would be appreciated, especially if someone
> >>>>> has a similar deployment that I could compare notes with.
> >>>>>
> >>>>> QH
> >>>>>
> >>>>> 1 - My testing qemu invocation: qemu-system-x86_64 -cpu host -m 2G
> >>>>> -display vnc=0.0.0.0:1 -enable-kvm -vga std -rtc base=utc -drive
> >>>>> if=none,id=blk0,cache=none,aio=native,file=/root/cirros.raw -device
> >>>>> virtio-blk-pci,drive=blk0,id=blk0
> >>>>>
> >>>>> 2 - simple dd performance test: time dd if=/dev/zero of=deleteme.bin
> >>>>> bs=20M count=256
> >>>>>
> >>>>> Hi Quentin,
> >>>>> Do you have the passthrough options on the host kernel command line?
> >>>>> I think it's intel_iommu=on
> >>>>>
> >>>>> --larry
> >>>>
> >>>
> >>
> >
>
> --
> Cheers,
> ~Blairo
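The dd test in [2] writes through the guest page cache, which is why (as Blair notes above) it can report cache speed rather than disk speed. A sketch of two variants that force the data to stable storage before the rate is reported; the file name is a placeholder, as in [2]:

```shell
# Flush to disk before dd exits, so the reported rate includes writeback:
dd if=/dev/zero of=deleteme.bin bs=20M count=256 conv=fdatasync

# Bypass the page cache entirely (requires O_DIRECT support on the filesystem):
dd if=/dev/zero of=deleteme.bin bs=20M count=256 oflag=direct

# Clean up the test file
rm -f deleteme.bin
```

Run inside the guest and on the bare-metal host, these give a like-for-like comparison; the unflushed form of [2] does not.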