On 11/20/2015 12:22 PM, Alex Williamson wrote:
On Fri, 2015-11-20 at 10:58 +0800, Jike Song wrote:
On 11/19/2015 11:52 PM, Alex Williamson wrote:
On Thu, 2015-11-19 at 15:32 +0000, Stefano Stabellini wrote:
On Thu, 19 Nov 2015, Jike Song wrote:
Hi Alex, thanks for the discussion.

In addition to Kevin's replies, I have a high-level question: can VFIO
be used by QEMU for both KVM and Xen?

No. VFIO cannot be used with Xen today. When running on Xen, the IOMMU
is owned by Xen.

Right, but in this case we're talking about device MMUs, which are owned
by the device driver, which I think is running in dom0, right?  This
proposal doesn't require support of the system IOMMU; the dom0 driver
maps IOVA translations just as it would for itself.  We're largely
proposing use of the VFIO API to provide a common interface to expose a
PCI(e) device to QEMU, but what happens in the vGPU vendor device and
IOMMU backends is specific to the device and perhaps even specific to
the hypervisor.  Thanks,

Let me summarize, and please correct me if I have misread anything: the
vGPU interface between the kernel and QEMU will go through VFIO, with a
new VFIO backend (instead of the existing type1), for both KVMGT and
XenGT?

My primary concern is KVM and QEMU upstream; the proposal is not
specifically directed at XenGT, but does not exclude it either.  Xen is
welcome to adopt this proposal as well; it simply defines the channel
through which vGPUs are exposed to QEMU as the VFIO API.  The core VFIO
code in the Linux kernel is just as available for use in Xen dom0 as it
is for a KVM host.  VFIO in QEMU certainly knows about some
accelerations for KVM, but these are almost entirely around allowing
eventfd-based interrupts to be injected through KVM, which is something
I'm sure Xen could provide as well.  These accelerations are also not
required; VFIO-based device assignment in QEMU works with or without
KVM.  Likewise, the VFIO kernel interface knows nothing about KVM and
has no dependencies on it.

There are two components to the VFIO API.  One is the type1-compliant
IOMMU interface, which for this proposal is really doing nothing more
than tracking the HVA to GPA mappings for the VM.  This much seems
entirely common regardless of the hypervisor.  The other part is the
device interface.  The lifecycle of the virtual device seems like it
would be entirely shared, as would many of the emulation components of
the device.  When we get to pinning pages, providing direct access to
memory ranges for a VM, and accelerating interrupts, the vGPU drivers
will likely need some per-hypervisor branches, but these are areas where
that's true no matter what the interface.  I'm probably
oversimplifying, but hopefully not too much; correct me if I'm wrong.
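
As a rough illustration of the mapping-tracking role described above,
here is a minimal Python sketch.  It is not VFIO or kernel code:
MappingTracker, its method names, and the 4 KiB page size are all
hypothetical and chosen only for illustration.  The GPA-to-HVA lookup
direction is likewise an assumption about how a vGPU driver might
consume such a table.

```python
from bisect import bisect_right
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size, for illustration only


@dataclass(frozen=True)
class Mapping:
    gpa: int   # guest physical address of the start of the range
    hva: int   # host virtual address backing it in the userspace process
    size: int  # length of the range in bytes


class MappingTracker:
    """Hypothetical model of a table that only records GPA/HVA ranges."""

    def __init__(self):
        self._maps = []  # kept sorted by gpa, ranges never overlap

    def map(self, gpa, hva, size):
        """Record that [gpa, gpa + size) is backed by [hva, hva + size)."""
        if size <= 0 or gpa % PAGE_SIZE or hva % PAGE_SIZE or size % PAGE_SIZE:
            raise ValueError("gpa, hva and size must be page aligned")
        idx = bisect_right([m.gpa for m in self._maps], gpa)
        # The previous range must end at or before the new start.
        if idx > 0 and self._maps[idx - 1].gpa + self._maps[idx - 1].size > gpa:
            raise ValueError("range overlaps an existing mapping")
        # The next range must start at or after the new end.
        if idx < len(self._maps) and gpa + size > self._maps[idx].gpa:
            raise ValueError("range overlaps an existing mapping")
        self._maps.insert(idx, Mapping(gpa, hva, size))

    def unmap(self, gpa, size):
        """Drop every mapping fully inside [gpa, gpa + size); return the count."""
        before = len(self._maps)
        self._maps = [
            m for m in self._maps
            if not (gpa <= m.gpa and m.gpa + m.size <= gpa + size)
        ]
        return before - len(self._maps)

    def translate(self, gpa):
        """Return the HVA backing a guest physical address, or None."""
        idx = bisect_right([m.gpa for m in self._maps], gpa) - 1
        if idx >= 0:
            m = self._maps[idx]
            if gpa < m.gpa + m.size:
                return m.hva + (gpa - m.gpa)
        return None
```

In this model, nothing is programmed into a system IOMMU; the table
is just bookkeeping that a device-specific backend could consult
when it needs to pin or access guest memory, which is why it could
plausibly be shared across hypervisors.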


Thanks for the confirmation. For QEMU/KVM, I totally agree with your
point. However, if we take XenGT into consideration, it becomes a bit
more complex: with the Xen hypervisor and the Dom0 kernel running at
different levels, it is not straightforward for QEMU to do something
like mapping a portion of an MMIO BAR via VFIO in the Dom0 kernel,
instead of calling hypercalls directly.

I don't know if there is a better way to handle this. But I do agree that
a channel between the kernel and QEMU via VFIO is a good idea, even
though we may have to split KVMGT/XenGT in QEMU a bit.  We are currently
working on moving all of the PCI CFG emulation from the kernel to QEMU;
hopefully we can release it by the end of this year and work with you
guys to adjust it to the agreed method.


The benefit, of course, is that aside from some extensions to the API,
the QEMU components are already in place, and being able to support
multiple vendors, perhaps multiple hypervisors, with the same code gives
a lot more leverage for getting both QEMU and libvirt support upstream.
Also, I'm not sure how useful it is, but VFIO is a userspace driver
interface, and here we're predominantly talking about that userspace
driver being QEMU.  It's not limited to that though.  A userspace
compute application could have direct access to a vGPU through this
model.  Thanks,



Alex

--
Thanks,
Jike