Hello bugproxy, or anyone else affected,

Accepted qemu into bionic-proposed. The package will build now and be
available at
https://launchpad.net/ubuntu/+source/qemu/1:2.11+dfsg-1ubuntu7.21
in a few hours, and then in the -proposed repository.

Please help us by testing this new package.  See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed.  Your feedback will aid us in getting this
update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested and change the tag from
verification-needed-bionic to verification-done-bionic. If it does not
fix the bug for you, please add a comment stating that, and change the
tag to verification-failed-bionic. In either case, without details of
your testing we will not be able to proceed.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance for helping!

N.B. The updated package will be released to -updates after the bug(s)
fixed by this package have been verified and the package has been in
-proposed for a minimum of 7 days.

** Changed in: qemu (Ubuntu Bionic)
       Status: Triaged => Fix Committed

** Tags added: verification-needed verification-needed-bionic

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1847948

Title:
  Improve NVMe guest performance on Bionic QEMU

Status in The Ubuntu-power-systems project:
  In Progress
Status in linux package in Ubuntu:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in linux source package in Bionic:
  Won't Fix
Status in qemu source package in Bionic:
  Fix Committed

Bug description:
  [Impact]

   * In the past qemu has generally not allowed MSI-X BAR mapping on VFIO.
     But there can be platforms (like ppc64 spapr) that can and want to do
     exactly that.

   * Backport two patches from upstream (in since qemu 2.12 / Disco).

   * This yields a tremendous speedup, especially with page sizes bigger
     than 4k: the BAR is no longer split into chunks, and direct MMIO
     access becomes possible for the guest.

  [Test Case]

   * On ppc64, pass through an NVMe device to the guest and run I/O
     benchmarks; see below for details on how to set that up.
     Note: this needs the HWE kernel or another kernel fixup for [1].
     Note: the test should also be done with the non-HWE kernel; the
     expectation there is that it will not show the perf benefits but
     will still work fine.

  [Regression Potential]

   * Changes:
     a) if the host driver allows mapping of MSI-X data, the entire BAR is
        mapped. This is only done if the kernel reports that capability [1],
        which ensures that qemu exposes the new behavior only on kernels
        able to support it (safe against regression in that regard).
     b) on ppc64, MSI-X emulation is disabled for VFIO devices. This is local
        to just this HW and will not affect other HW.

     Generally the regressions that come to mind are slight changes in
     behavior (real HW vs the former emulation) that on some weird/old
     guests could cause trouble. But then it is limited to only PPC where
     only a small set of certified HW is really allowed.

     The mapping that might be added even on other platforms should not
     consume too much extra memory as long as it isn't used. Further, since
     it depends on the kernel capability, it isn't randomly issued on kernels
     where we expect it to fail.

     So while it is quite a change, it seems safe to me.

  [Other Info]

   * I know, one could as well call that a "feature", but it really is a
     performance bug fix more than anything else. Also the SRU policy allows
     exploitation/toleration of new HW especially for LTS releases.
     Therefore I think this is fine as SRU.

  [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a32295c612c57990d17fb0f41e7134394b2f35f6

  == Comment: #0 - Murilo Opsfelder Araujo  - 2019-10-11 14:16:14 ==

  ---Problem Description---
  Back-port the following patches to Bionic QEMU to improve NVMe guest
  performance by more than 200%:

  "vfio-pci: Allow mmap of MSIX BAR"
  https://git.qemu.org/?p=qemu.git;a=commit;h=ae0215b2bb56a9d5321a185dde133bfdd306a4c0

  "ppc/spapr, vfio: Turn off MSIX emulation for VFIO devices"
  https://git.qemu.org/?p=qemu.git;a=commit;h=fcad0d2121976df4b422b4007a5eb7fcaac01134

  ---uname output---
  na

  ---Additional Hardware Info---
  0030:01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller 172Xa/172Xb (rev 01)

  Machine Type = AC922

  ---Debugger---
  A debugger is not configured

  ---Steps to Reproduce---
   Install or set up a guest image and boot it.

  Once guest is running, passthrough the NVMe disk to the guest using
  the XML:

  host$ cat nvme-disk.xml
  <hostdev mode='subsystem' type='pci' managed='no'>
     <driver name='vfio'/>
      <source>
          <address domain='0x0030' bus='0x01' slot='0x00' function='0x0'/>
      </source>
  </hostdev>

  host$ virsh attach-device <domain> nvme-disk.xml --live

  On the guest, run fio benchmarks:

  guest$ fio --direct=1 --rw=randrw --refill_buffers --norandommap
  --randrepeat=0 --ioengine=libaio --bs=4k --rwmixread=100 --iodepth=16
  --runtime=60 --name=job1 --filename=/dev/nvme0n1 --numjobs=4
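
  The two cases reported here (numjobs=4 and numjobs=64) could be
  generated from one helper. This is only a sketch: fio_cmd and the
  DEVICE variable are hypothetical names that are not part of the
  original report, and the helper just prints the command line from
  the step above instead of running it.

```shell
# Hypothetical helper: print the fio command from the step above for a
# given numjobs value. DEVICE is an assumed override; it defaults to
# /dev/nvme0n1 as in the original report.
fio_cmd() {
    jobs="$1"
    dev="${DEVICE:-/dev/nvme0n1}"
    echo "fio --direct=1 --rw=randrw --refill_buffers --norandommap" \
         "--randrepeat=0 --ioengine=libaio --bs=4k --rwmixread=100 --iodepth=16" \
         "--runtime=60 --name=job1 --filename=$dev --numjobs=$jobs"
}

# Print the command for both cases measured in this report.
for jobs in 4 64; do
    fio_cmd "$jobs"
done
```

  To actually run one case on the guest, the printed line can be piped
  to a shell, for example "fio_cmd 4 | sh".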

  Results are similar with numjobs=4 and numjobs=64, respectively:

     READ: bw=385MiB/s (404MB/s), 78.0MiB/s-115MiB/s (81.8MB/s-120MB/s), io=11.3GiB (12.1GB), run=30001-30001msec
     READ: bw=382MiB/s (400MB/s), 2684KiB/s-12.6MiB/s (2749kB/s-13.2MB/s), io=11.2GiB (12.0GB), run=30001-30009msec

  With the two patches applied, performance improved significantly for
  numjobs=4 and numjobs=64 cases, respectively:

     READ: bw=1191MiB/s (1249MB/s), 285MiB/s-309MiB/s (299MB/s-324MB/s), io=34.9GiB (37.5GB), run=30001-30001msec
     READ: bw=4273MiB/s (4481MB/s), 49.7MiB/s-113MiB/s (52.1MB/s-119MB/s), io=125GiB (134GB), run=30001-30005msec
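
  As a rough cross-check of the "more than 200%" figure, the aggregate
  READ bandwidths quoted above can be compared directly. This is a
  minimal sketch; speedup is a hypothetical helper name, and the inputs
  are just the MiB/s values copied from the fio summaries above.

```shell
# Hypothetical helper: ratio of patched to unpatched aggregate bandwidth.
# Arguments: <before MiB/s> <after MiB/s>
speedup() {
    awk -v before="$1" -v after="$2" 'BEGIN { printf "%.2f\n", after / before }'
}

speedup 385 1191    # numjobs=4  -> prints 3.09
speedup 382 4273    # numjobs=64 -> prints 11.19
```

  A ratio of about 3.09 corresponds to an improvement of roughly 209%
  for numjobs=4; the numjobs=64 case improves by considerably more.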

  Userspace tool common name: qemu

  Userspace rpm: qemu

  The userspace tool has the following bit modes: 64-bit

  Userspace tool obtained from project website:  na

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1847948/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
