> One feature we need for QEMU/KVM on embedded Power Architecture is the
> ability to do passthru assignment of SoC I/O devices and memory. An
> important use case in embedded is creating static partitions--
> taking physical memory and I/O devices (non-PCI) and partitioning
> them between the host Linux and several virtual machines. Things like
> live migration would not be needed or supported in these types of
> scenarios.
>
> SoC devices do not sit on a probeable bus and there are no identifiers
> like 01:00.0 with PCI that we can use to identify devices-- the host
> Linux kernel is made aware of SoC I/O devices from nodes/properties in a
> device tree structure passed at boot. QEMU needs to generate a
> device tree to pass to the guest as well with all the guest's virtual
> and physical resources. Today a number of mostly complete guest device
> trees are kept under ./pc-bios in QEMU, but this too static and
> inflexible.

I doubt you're going to get generic passthrough of arbitrary devices
working in a useful way. My expectation is that, at minimum, you'll need
a bus-specific proxy device, i.e. create a virtual device in qemu that
responds to the guest, and happens to poke at a host device rather than
emulating things directly.

For buses like I2C this is fairly trivial: all communication with the
device goes down a single well-defined and easily proxied channel. For
more complex buses you end up having to emulate a lot more. Basically
you have to emulate everything that is different between the host and
guest. If that happens to include device-specific state then you lose.

Using PCI devices as an example: the resources provided by the device
are self-describing, so proxying those is fairly straightforward, and
doesn't even require manual configuration. However, replicating the
environment seen by the device is trickier, as PCI devices can initiate
memory accesses (i.e. bus-master). For machines without an IOMMU this
means passthrough in general can't work, and a substantial amount of
device-specific knowledge is required. You'd need to intercept and
modify and/or proxy all data relating to DMA addresses.

In practice you need to emulate an IOMMU inside qemu (so you can
determine the address space accessed by the device), and arrange for the
host IOMMU to present the same virtual address space to the real device.

Paul
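
As a rough illustration of the IOMMU arrangement described above, the
sketch below composes the guest-programmed mapping (device address to
guest physical address) with the host's mapping of guest memory (guest
physical to host physical), giving the entries the host IOMMU would need
so the real device sees the same address space as the emulated one. This
is not QEMU code: the class, the function names, the dictionary-based
tables and the 4 KiB page size are all assumptions made for illustration.

```python
PAGE = 4096  # assumed page size, for illustration only


class EmulatedIommu:
    """Hypothetical guest-visible IOMMU: device page address -> guest physical page."""

    def __init__(self):
        self.table = {}

    def map(self, dev_page, guest_page):
        # Both addresses must be page aligned.
        assert dev_page % PAGE == 0 and guest_page % PAGE == 0
        self.table[dev_page] = guest_page


def shadow_mappings(guest_iommu, guest_to_host):
    """Compose device->guest with guest->host to get device->host entries.

    The result is what the host IOMMU would have to be programmed with so
    that the real device sees the address space the guest set up.
    """
    shadow = {}
    for dev_page, guest_page in guest_iommu.table.items():
        if guest_page not in guest_to_host:
            raise KeyError(f"guest page {guest_page:#x} is not backed by host memory")
        shadow[dev_page] = guest_to_host[guest_page]
    return shadow


def translate(shadow, dev_addr):
    """Translate one device-issued DMA address using the composed table."""
    offset = dev_addr % PAGE
    return shadow[dev_addr - offset] + offset


# The guest maps device page 0x1000 to guest physical page 0x20000;
# the host backs that guest page with host physical page 0x7f000.
guest = EmulatedIommu()
guest.map(0x1000, 0x20000)
shadow = shadow_mappings(guest, {0x20000: 0x7f000})
host_addr = translate(shadow, 0x1234)
```

Without a host IOMMU there is nothing to load the composed table into,
which is why the DMA addresses themselves would have to be intercepted
and rewritten with device-specific knowledge.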