Thanks again for the input. I appreciate the suggestions and sharing your
experiences.

Yesterday I decided to restart the whole process of benchmarking and
proving what is going on from the beginning, and discovered that the
low-level benchmarks I was basing my assumptions on were wrong. The
performance I'm getting inside the VMs is in line with what I'm getting
elsewhere with the ceph volumes. So I'm changing gears again and working on
getting that to the level I need it to be. If something _real_ comes up in
qemu, I'll be back! (And I'll be lurking on the list now that I'm signed up.)

QH


On Fri, May 30, 2014 at 1:29 AM, Stephan von Krawczynski <sk...@ithnet.com>
wrote:

> Hello all,
>
> let me briefly share my own experience with virtual disk performance.
> In short: I stopped using image files of any kind and switched to lvm
> partitions. The reason is that your virtual hosts will choke if you
> copy files to your physical host over the net, or on the host itself.
> I tried everything to prevent that, but in the end I gave up. My host
> had 256 GB RAM and 32 Xeon processors, and it was still impossible to
> copy a 10 GByte file to the host without freezing the virtuals. I find
> this pretty ridiculous, but I do not blame qemu; the cause seems to be
> kernel-vfs related.
> If you want working and performant qemu virtual hosts, back your
> virtual disks with block devices rather than image files.
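>
> A minimal sketch of what that could look like (the volume group name,
> size, and device paths here are illustrative, not from my actual
> setup): carve out a logical volume and hand the block device straight
> to qemu:
>
>   lvcreate -L 20G -n vm-disk0 vg0
>   qemu-system-x86_64 -enable-kvm -m 2G \
>     -drive file=/dev/vg0/vm-disk0,if=virtio,format=raw,cache=none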
>
> Regards,
> Stephan
>
> On Fri, 30 May 2014 13:13:13 +1000
> Blair Bethwaite <blair.bethwa...@gmail.com> wrote:
>
> > Quentin,
> >
> > I doubt you'll get much useful help until you do a reasonable
> > benchmark. The dd you mention in the original post is just testing
> > your guest's page cache, i.e. it is not doing direct I/O. I'd suggest
> > grabbing a newish fio and using the genfio tool to quickly set up
> > some more varied benchmarks.
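> >
> > If you want a quick one-liner before setting up genfio, a minimal
> > direct-I/O write test might look like this (the file path, size, and
> > queue depth are illustrative):
> >
> >   fio --name=writetest --filename=/mnt/test.fio --rw=write --bs=1M \
> >     --size=4G --direct=1 --ioengine=libaio --iodepth=16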
> >
> > Regarding your Ceph config, you might like to elaborate on the cluster
> > setup and provide some detail as to the librbd options you're giving to
> > Qemu, plus what Ceph baseline you are comparing against.
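> >
> > For reference, the librbd options usually show up in the qemu drive
> > spec, something like the following (the pool/image names and cache
> > mode here are just an example, not a recommendation):
> >
> >   -drive if=none,id=blk0,format=raw,cache=writeback,\
> >     file=rbd:rbd/myimage:id=admin:conf=/etc/ceph/ceph.conf \
> >   -device virtio-blk-pci,drive=blk0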
> >
> >
> > On 30 May 2014 02:38, Quentin Hartman <qhart...@gmail.com> wrote:
> >
> > > I don't know what I changed, but this morning instances running on
> > > local storage are returning essentially bare-metal performance. The
> > > only difference I see versus my test scenario is that they are
> > > actually using qcow images instead of RAW. So, hurray! Assuming it
> > > stays performant, the local storage problem is fixed. Now I just
> > > need to get ceph-backed instances working right.
> > >
> > > Would this be the appropriate venue for that discussion, or would a
> > > ceph list be better? I believe the problem lies with qemu's
> > > interaction with librbd, since my direct tests of the ceph cluster
> > > indicate good performance.
> > >
> > >
> > > On Thu, May 29, 2014 at 10:14 AM, Quentin Hartman <qhart...@gmail.com>
> > > wrote:
> > >
> > >> I found this page: http://www.linux-kvm.org/page/Tuning_Kernel and
> > >> all of the recommended kernel options are enabled or built as
> > >> modules, which are loaded.
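> > >>
> > >> A quick way to double-check that (the config file path assumes a
> > >> stock Ubuntu kernel):
> > >>
> > >>   grep -E 'CONFIG_(KVM|VIRTIO_BLK|HIGH_RES_TIMERS)' /boot/config-$(uname -r)
> > >>   lsmod | grep -E 'kvm|virtio'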
> > >>
> > >>
> > >> On Thu, May 29, 2014 at 10:07 AM, Quentin Hartman <qhart...@gmail.com>
> > >> wrote:
> > >>
> > >>> It looks like that particular feature is already enabled:
> > >>>
> > >>> root@node13:~# dmesg | grep -e DMAR -e IOMMU
> > >>> [    0.000000] ACPI: DMAR 00000000bf77e0c0 000100 (v01    AMI  OEMDMAR 00000001 MSFT 00000097)
> > >>> [    0.105190] dmar: IOMMU 0: reg_base_addr fbffe000 ver 1:0 cap c90780106f0462 ecap f020f6
> > >>>
> > >>>
> > >>>
> > >>> On Thu, May 29, 2014 at 10:04 AM, Quentin Hartman <qhart...@gmail.com>
> > >>> wrote:
> > >>>
> > >>>> I do not. I did not know those were a thing. My next steps were
> > >>>> to experiment with different BIOS settings and kernel parameters,
> > >>>> so this is a very timely suggestion. Thanks for the reply. I
> > >>>> would love to hear other suggestions for kernel parameters that
> > >>>> may be relevant.
> > >>>>
> > >>>> QH
> > >>>>
> > >>>>
> > >>>> On Thu, May 29, 2014 at 9:58 AM, laurence.schuler <
> > >>>> laurence.schu...@nasa.gov> wrote:
> > >>>>
> > >>>>>  On 05/28/2014 07:56 PM, Quentin Hartman wrote:
> > >>>>>
> > >>>>> Big picture, I'm working on getting an openstack deployment
> > >>>>> going using ceph-backed volumes, but I'm running into really
> > >>>>> poor disk performance, so I'm in the process of simplifying
> > >>>>> things to isolate exactly where the problem lies.
> > >>>>>
> > >>>>> The machines I'm using are HP Proliant DL160 G6 machines with
> > >>>>> 72GB of RAM. All the hardware virtualization features are turned
> > >>>>> on. Host OS is Ubuntu 14.04, using the deadline IO scheduler.
> > >>>>> I've run a variety of benchmarks to make sure the disks are
> > >>>>> working right, and they seem to be. Everything indicates bare
> > >>>>> metal write speeds to a single disk in the ~100MB/s ballpark.
> > >>>>> Some tests report as high as 120MB/s.
> > >>>>>
> > >>>>> To try to isolate the problem I've done some testing with a
> > >>>>> very simple [1] qemu invocation on one of the host machines.
> > >>>>> Inside that VM, I get about 50MB/s write throughput. I've tested
> > >>>>> with both qemu 2.0 and 1.7 and gotten similar results. For quick
> > >>>>> testing I'm using a simple dd command [2] to get a sense of
> > >>>>> where things lie. This has consistently produced results near
> > >>>>> what more intensive synthetic benchmarks (iozone and dbench)
> > >>>>> produced. I understand that I should be expecting closer to 80%
> > >>>>> of bare-metal performance, so it seems this would be the first
> > >>>>> place to focus to understand why things aren't going well.
> > >>>>>
> > >>>>> When running on a ceph-backed volume, I get closer to 15MB/s
> > >>>>> using the same tests, and see as much as 50% iowait. Typical
> > >>>>> operations that take seconds on bare metal take tens of seconds,
> > >>>>> or minutes, in a VM. This problem actually drove me to look at
> > >>>>> things with strace, and I'm finding streams of fsync and
> > >>>>> pselect6 timeouts while the processes are running. More direct
> > >>>>> tests of ceph performance are able to saturate the NIC, pushing
> > >>>>> about 90MB/s. I have ganglia installed on the host machines, and
> > >>>>> when I am running tests from within a VM, the network throughput
> > >>>>> seems to be getting artificially capped. Rather than the more
> > >>>>> "spiky" graph produced by the direct ceph tests, I get a
> > >>>>> perfectly flat horizontal line at 10 or 20MB/s.
> > >>>>>
> > >>>>> Any and all suggestions would be appreciated, especially if
> > >>>>> someone has a similar deployment that I could compare notes
> > >>>>> with.
> > >>>>>
> > >>>>>  QH
> > >>>>>
> > >>>>>  1 - My testing qemu invocation: qemu-system-x86_64 -cpu host -m 2G
> > >>>>> -display vnc=0.0.0.0:1 -enable-kvm -vga std -rtc base=utc -drive
> > >>>>> if=none,id=blk0,cache=none,aio=native,file=/root/cirros.raw -device
> > >>>>> virtio-blk-pci,drive=blk0,id=blk0
> > >>>>>
> > >>>>> 2 - simple dd performance test: time dd if=/dev/zero
> > >>>>> of=deleteme.bin bs=20M count=256
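> > >>>>>
> > >>>>> (For comparison, a variant that bypasses the guest page cache;
> > >>>>> oflag=direct is a standard GNU dd flag:
> > >>>>> time dd if=/dev/zero of=deleteme.bin bs=20M count=256 oflag=direct)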
> > >>>>>
> > >>>>> Hi Quentin,
> > >>>>> Do you have the passthrough options on the host kernel command
> > >>>>> line? I think it's intel_iommu=on
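> > >>>>>
> > >>>>> On Ubuntu that would typically mean adding it in
> > >>>>> /etc/default/grub and regenerating the config (assuming a stock
> > >>>>> GRUB2 setup):
> > >>>>>
> > >>>>>   GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on"
> > >>>>>   sudo update-grub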
> > >>>>>
> > >>>>> --larry
> > >>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>
> > >
> >
> >
> > --
> > Cheers,
> > ~Blairo
>
>
