Hi,

These patches introduce Intel IOMMU (VT-d) emulation to the q35 chipset. The major job in these patches is to add support for emulating Intel IOMMU according to the VT-d specification, including basic responses to CSR accesses, the logic of DMAR (DMA remapping) and DMA memory address translation.
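As a rough illustration of the address translation mentioned above, here is a minimal sketch of how a DMA address is split into table indices. It is not taken from the patches: all names with the sketch_/SKETCH_ prefix are hypothetical, and it assumes 4 KiB pages, 9 index bits per page-table level, and a context table indexed by the 8-bit devfn of the requesting device.

```c
#include <stdint.h>

/* Assumed constants for this sketch; the names are hypothetical. */
#define SKETCH_PAGE_SHIFT  12   /* 4 KiB pages */
#define SKETCH_LEVEL_BITS  9    /* 512 entries per page table */
#define SKETCH_LEVEL_MASK  ((1u << SKETCH_LEVEL_BITS) - 1)

/* Context-table index: 5-bit device number followed by 3-bit function number. */
static inline uint8_t sketch_devfn(unsigned dev, unsigned fn)
{
    return (uint8_t)(((dev & 0x1f) << 3) | (fn & 0x7));
}

/* Page-table index used at the given level (level 1 is the leaf level). */
static inline unsigned sketch_level_index(uint64_t iova, unsigned level)
{
    unsigned shift = SKETCH_PAGE_SHIFT + (level - 1) * SKETCH_LEVEL_BITS;
    return (unsigned)((iova >> shift) & SKETCH_LEVEL_MASK);
}

/* Byte offset inside the final 4 KiB page. */
static inline unsigned sketch_page_offset(uint64_t iova)
{
    return (unsigned)(iova & ((1u << SKETCH_PAGE_SHIFT) - 1));
}
```

A walk would pick the root entry by bus number, the context entry by sketch_devfn(), and then descend from the top level down to level 1 with sketch_level_index(). The real code in hw/i386/intel_iommu.c may be organized differently.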

Features implemented for now are:
1. Response to important CSR accesses;
2. DMAR (DMA remapping) without PASID support;
3. Use register-based invalidation for IOTLB and context cache invalidation;
4. Add DMAR table to ACPI tables to expose VT-d to BIOS;
5. Add "-machine iommu=on|off" option to enable/disable VT-d;
6. Only one DMAR unit for all the devices of PCI Segment 0.

Testing:
1. An L1 Linux guest with intel_iommu=on can interact with VT-d and boot smoothly, and I can see info about VT-d in the kernel log;
2. Run L1 with VT-d; an L2 Linux guest can boot smoothly without PCI device passthrough;
3. Run L1 with VT-d and "-soundhw ac97 (QEMU_AUDIO_DRV=none)", then assign the sound card to L2; L2 can boot smoothly with legacy PCI assignment;
4. Jailhouse hypervisor seems to run smoothly for now (tested by Jan).
5. Run L1 with VT-d and an e1000 network card, then assign the e1000 to L2; L2 gets STUCK when booting. This is still unsolved. As far as I can tell, L2 crashes in e1000_probe(). The QEMU of L1 dumps something with "KVM: entry failed, hardware error 0x0", and the KVM of the host prints "nested_vmx_exit_handled failed vm entry 7". Unlike the sound card case, after the e1000 is assigned to L2 there is no translation entry for it in VT-d, which I think means that the e1000 doesn't issue any DMA access during the boot of L2. Sometimes the L2 kernel prints "divide error" during boot. Can someone help me with this? Any help is appreciated! :)
6. VFIO is tested and behaves the same as legacy PCI assignment.

I have some questions to ask here:
1. Now the struct IntelIOMMUState is a member of MCHPCIState. VT-d is registered as TYPE_SYS_BUS_DEVICE but registers its configuration MemoryRegion as a subregion of mch->pci_address_space. Is this correct? Another thought that comes to mind is using sysbus_mmio_map() to map the MemoryRegion of VT-d, but I am not sure. There may also be other improper uses of QOM.
2. For a declaration of a pointer to pointer, like VTDAddressSpace **address_spaces, checkpatch.pl warns "ERROR: need consistent spacing around '*' (ctx:WxO)". Is checkpatch.pl wrong?

TODO:
1. Fix the bug of legacy PCI assignment;
2. Clean up code related to migration;
3. Queued Invalidation;
4. Basic fault reporting;
5. Caching properties of IOTLB;

Changes since v1:
*address review suggestions given by Michael, Paolo, Stefan and Jan
-split intel_iommu.h to include/hw/i386/intel_iommu.h and
 hw/i386/intel_iommu_internal.h
-change the copyright information
-change D() to VTD_DPRINTF()
-remove dead code
-rename constant definitions with consistent prefix VTD_
-rename some struct definitions according to QEMU standard
-rename some CSR access functions
-use endian-safe functions to access CSRs
-change machine option to "iommu=on|off"

Thanks very much!

Git trees:
https://github.com/tamlok/qemu

Le Tan (3):
  intel-iommu: introduce Intel IOMMU (VT-d) emulation
  intel-iommu: add DMAR table to ACPI tables
  intel-iommu: add Intel IOMMU emulation to q35 and add a machine option
    "iommu" as a switch

 hw/core/machine.c              |  27 +-
 hw/i386/Makefile.objs          |   1 +
 hw/i386/acpi-build.c           |  41 ++
 hw/i386/acpi-defs.h            |  70 ++++
 hw/i386/intel_iommu.c          | 911 +++++++++++++++++++++++++++++++++++++++++
 hw/i386/intel_iommu_internal.h | 257 ++++++++++++
 hw/pci-host/q35.c              |  72 +++-
 include/hw/boards.h            |   1 +
 include/hw/i386/intel_iommu.h  |  75 ++++
 include/hw/pci-host/q35.h      |   2 +
 qemu-options.hx                |   5 +-
 vl.c                           |   4 +
 12 files changed, 1457 insertions(+), 9 deletions(-)
 create mode 100644 hw/i386/intel_iommu.c
 create mode 100644 hw/i386/intel_iommu_internal.h
 create mode 100644 include/hw/i386/intel_iommu.h

-- 
1.9.1