On 4/4/19 11:24 AM, Alex Williamson wrote:
On Wed, 3 Apr 2019 23:31:22 -0500
Shawn Anastasio <sh...@anastas.io> wrote:
On 4/3/19 10:23 PM, Alex Williamson wrote:
On Wed, 3 Apr 2019 22:01:14 -0500
Shawn Anastasio <sh...@anastas.io> wrote:
Hello all,
I'm currently writing an application that makes use of Qemu's ivshmem
shared memory mechanism, which exposes shared memory regions from the
host via PCI-E BARs. MSI-X interrupts that are tied to host eventfds are
also exposed.
Since ivshmem doesn't have an in-tree kernel driver, I have been using
VFIO's NOIOMMU mode to interface with the device. This works wonderfully
for both BAR mapping and MSI-X interrupts. Unfortunately though, binding
the ivshmem device to vfio_pci to use it in this way results in a kernel
taint. I understand that this is because without an IOMMU, VFIO/Linux
has no way of preventing devices from performing malicious access to
other system memory. In the case of ivshmem though, the device does not
have any DMA capabilities.
The MSI-X interrupt is a DMA.
I hadn't realized this. That means then without an IOMMU, an
MSI-X capable device is capable of reading/writing arbitrary
memory?
Writing, at least. This is why even with an IOMMU there's an opt-in if
that IOMMU lacks interrupt remapping support.
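To illustrate why an MSI-X interrupt counts as a DMA write, here is a minimal Python sketch that decodes one 16-byte MSI-X table entry. The field layout (address low, address high, data, vector control) follows the PCI specification; the function name and the sample values are assumptions made up for illustration, not taken from a real device. Whatever address is programmed into the entry is where the device writes its message, and without an IOMMU nothing constrains that address.

```python
import struct

# One MSI-X table entry is 16 bytes, little-endian:
# message address (low 32 bits), message address (high 32 bits),
# message data, vector control (bit 0 = mask).
MSIX_ENTRY = struct.Struct("<IIII")


def decode_msix_entry(raw: bytes) -> dict:
    """Decode a single MSI-X table entry (hypothetical helper)."""
    addr_lo, addr_hi, data, ctrl = MSIX_ENTRY.unpack(raw)
    return {
        "address": (addr_hi << 32) | addr_lo,  # target of the memory write
        "data": data,                          # 32-bit payload the device writes
        "masked": bool(ctrl & 0x1),            # per-vector mask bit
    }


# Made-up sample entry; the values are illustrative only.
example = MSIX_ENTRY.pack(0xFEE00000, 0x0, 0x4021, 0x0)
entry = decode_msix_entry(example)
print(f"device writes {entry['data']:#x} to {entry['address']:#x}")
```

Signaling the interrupt is simply the device performing that write, which is why it is treated like any other device-initiated memory access.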
Understood. That makes sense.
This has created a situation in which the
safest possible way to access the device (a kernel driver would be
inherently less safe, UIO can't access the MSI-X functionality of the
device) results in a kernel taint, when other, less safe methods don't.
MSI-X support in UIO was rejected because MSI-X is a DMA and UIO does
not support devices that do DMA. Vfio-noiommu was a compromise to
allow using the vfio API, but recognizing that it's inherently unsafe.
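For readers who want to check whether the taint discussed here has been set, the following Python sketch inspects the kernel taint bitmask. It assumes that vfio-noiommu sets the "user" taint flag (bit 6, value 64, shown as 'U'), which is my understanding of the kernel source but should be verified against your kernel version. The helper names are hypothetical.

```python
# Assumption: vfio-noiommu taints with the 'U' (user-requested) flag, bit 6.
TAINT_USER_BIT = 6


def has_user_taint(tainted: int) -> bool:
    """Return True if the user-requested taint bit is set in the bitmask."""
    return bool(tainted & (1 << TAINT_USER_BIT))


def read_tainted(path: str = "/proc/sys/kernel/tainted") -> int:
    """Read the current taint bitmask from procfs (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())
```

On a Linux system, `has_user_taint(read_tainted())` reports whether that flag is currently set; note that other sources can set the same flag, so it does not prove vfio-noiommu was the cause.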
In light of this, I propose a change to the VFIO framework that would
allow use cases such as this without a kernel taint. One solution I see
is only tainting when PCI devices with DMA capabilities are bound to
VFIO. It is my understanding that a device's DMA capability can be
determined by checking the Bus Mastering flag in the device's PCI
configuration space, so something like this should be feasible.
The bus master bit is not a capability that can be probed. Enabling bus
master allows a device to perform DMA, including signaling via MSI
interrupts. No bus master, no MSI.
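As a small illustration of the point above, this Python sketch reads the Bus Master Enable bit from a device's PCI configuration header. The Command register offset (0x04) and the bit position (bit 2) come from the PCI specification; the helper names are hypothetical. The bit only tells you whether bus mastering is currently enabled, not whether the device is able to do DMA.

```python
import struct

PCI_COMMAND_OFFSET = 0x04  # 16-bit Command register in the config header
PCI_COMMAND_MASTER = 0x4   # bit 2: Bus Master Enable


def bus_master_enabled(config: bytes) -> bool:
    """Report the current state of the Bus Master Enable bit."""
    (command,) = struct.unpack_from("<H", config, PCI_COMMAND_OFFSET)
    return bool(command & PCI_COMMAND_MASTER)


def read_config(bdf: str) -> bytes:
    """Read the first 64 config bytes from sysfs, e.g. bdf='0000:00:04.0'."""
    with open(f"/sys/bus/pci/devices/{bdf}/config", "rb") as f:
        return f.read(64)


# Synthetic headers for demonstration; no real device is involved.
header = bytearray(64)
struct.pack_into("<H", header, PCI_COMMAND_OFFSET, 0x0002)  # memory space only
before = bus_master_enabled(bytes(header))
struct.pack_into("<H", header, PCI_COMMAND_OFFSET, 0x0006)  # memory + bus master
after = bus_master_enabled(bytes(header))
```

Because software can flip this bit at any time, its current value says nothing about what the device could do once it is enabled, which is the reason it cannot serve as a "no DMA" test.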
Perhaps an additional NOIOMMU mode could be introduced which only allows
devices which meet this criterion, too (VFIO_NOIOMMU_NODMA_IOMMU?).
Along with a separate Kconfig option, this would allow users to enable
this safe usage at kernel build time, while still preventing the
possibility of an unsafe DMA capable device from being used.
I'm curious to hear feedback on this. If this is something that can be
merged, I'd be more than happy to write a patch.
Add a vIOMMU to your VM configuration (e.g. intel-iommu) and use proper
vfio in the guest. Thanks,
I had looked into this, but my application also targets ppc64, and a
cross-platform solution is therefore necessary.
Strangely enough when booting a VM on ppc64, the kernel /does/ report
an IOMMU, but there's only 1 group that contains all devices, so it
doesn't seem usable.
Yes, AIUI ppc64 PAPR machines always have an IOMMU and there is a
SPAPR IOMMU model in vfio. Maybe work with QEMU ppc64 developers to
figure out how the ivshmem device can be in its own group. This
probably requires configuring the VM with another PCI host bridge and
attaching the ivshmem device under it.
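To check how devices are grouped inside the guest, one can walk the IOMMU group directories in sysfs. The Python sketch below is a minimal example of that; the function name and the `root` parameter are hypothetical, while `/sys/kernel/iommu_groups/<N>/devices` is the standard Linux sysfs layout.

```python
import os


def iommu_groups(root: str = "/sys/kernel/iommu_groups") -> dict:
    """Map each IOMMU group name to the sorted list of devices in it."""
    groups = {}
    if not os.path.isdir(root):
        return groups  # no IOMMU groups exposed on this system
    for name in sorted(os.listdir(root)):
        devices_dir = os.path.join(root, name, "devices")
        if os.path.isdir(devices_dir):
            groups[name] = sorted(os.listdir(devices_dir))
    return groups


if __name__ == "__main__":
    for group, devices in iommu_groups().items():
        print(group, " ".join(devices))
```

If the ivshmem device shares its group with other devices, all of them would have to be bound to vfio together, which is why placing it alone in a group matters.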
Interesting. I'll contact the QEMU developers about this.
Thanks for the pointer.
I guess it all boils down to this - does this usage of VFIO-NOIOMMU
with an MSI-X device constitute a security risk? If so, it seems
I'll have no choice but to write a kernel driver for a cross-platform
solution.
There is no property we can detect about a PCI device to determine that
it doesn't support DMA. All PCI devices have DMA available to them.
Clearly we can't simply enforce that bus master is never enabled
because that breaks your use case of needing MSI interrupts and
presumes devices actually honor that bit and don't have more nefarious
ways of enabling it. So if we have no way to know the device
capabilities or the intention of the user, or exploitability of the
user, I don't see how we can create a policy that singles out this use
case as trusted. Thanks,
I see. Thank you for the explanation.
Thanks,
Shawn