This is v11 of the Intel IR work. It is rebased onto mst's branch
"tags/for_upstream", commit:

  "278a2a2 vmw_pvscsi: remove unnecessary internal msi state flag"

This series mainly fixes several issues raised in the v10 review
comments, fixes one bug with RHEL guests, adds acked-by from Paolo,
and is freshly rebased as mentioned above.

To make it fast, I only did quick tests for this version. But at
least they should cover basic functions like: IOAPIC, MSI, multiple
vcpus, different guests (4.7 upstream and rhel 7.2), vhost, split/off
irqchips. More tests to be done.

Meanwhile, there are several pending issues to be solved, which are
queued in my todo list, and I'll continue the work after this series
is merged.

Online branch:

  https://github.com/xzpeter/qemu vtd-intr-v11

Please review, thanks.

v11 changes (using v10 patch index):
- patch 2: split into two patches, one to rename VTD_* macros, the
  other to provide x86_iommu_get_default(). [mst]
- patch 5: removing DMAR_REPORT_F_INTR, with comments [mst]
- patch 8: use size_t for ioapic_scope_size, removing
  ACPI_DMAR_DEV_SCOPE_TYPE_IOAPIC [mst]
- patch 13: better handle "subhandle" field: this is a problem I
  found in v10 when testing with rhel guests.
- patch 18: tiny tweaks to let QEMU build with --disable-kvm ("#ifdef
  CONFIG_KVM")
- patch 24: still using v10 of this patch, and dropping v10.2 version
  (so this patch content is unmodified from v10)
- new patch added: throw error when kernel-irqchip=on is specified.
- put all new trace-events into specific directories
- add acked-by for Paolo (patch 16, 21-24, 26, 27)

v10 changes:
- Fix issue when specifying more than 1 vcpu. This was introduced in
  v9 after rebasing onto Marcel's patches. The problem is that,
  before Marcel's patch, we would first create the IOMMU and then the
  IOAPIC, while the order is switched after Marcel's changes. This
  affects patch 18 ("register IOMMU IEC notifier for ioapic") and I
  need to do the registration after IOAPIC realization.
- Display a readable error message if the user specifies more than
  one x86 vIOMMU, rather than an assertion failure.
  (patch 2)
- Correct vtd iec notifier "global" parameter: if granularity bit is
  clear (not set), then it's a global invalidation (patch 17, inverted
  meaning for granularity).
- added one more patch (patch 26) to add some trace events for irqchip
  msi routes operations.
- rebase to latest master

v9 changes:
- addressed several possible acpi issues with BE machines, and comment
  fix [Igor]
- removed patch 16 in v8 since it's useless after rebasing to Marcel's
  patches
- move vtd_svt_mask into vtd_irte_get() and declare it as constant.
- rebase to latest master, with Marcel's "-device intel-iommu" patch v2
- re-arrange patch order, moving x86-iommu to the beginning (so that I
  can add "intremap" property for it, which can be further shared by
  future AMD IOMMUs)
- add device property "intremap" for X86 IOMMU device (new patch 4 in
  v9)
- replace all existing references of MachineState.iommu_intr with
  device property X86IOMMUState.intr_supported, removing
  MachineState.iommu_intr
- some other minor changes due to the rebase

v8 changes:
- rebase to latest master
- patch 7
  - remove VTD_IR_IOAPICEntry, which is useless now
  - fix possible issue on big endian machines for VTD_IRTE,
    VTD_IR_MSIAddress
- patch 12
  - fix endianness issue with bit-field defines: fix BE issue with
    VTD_MSIMessage, do cpu_to_*() or reverse when necessary on
    bit-field uses.
- patch 19
  - used le32_to_cpu() for dest_id, and added my s-o-b line beneath
    Jan's.

v7 changes (using v6 patch index):
- patch 10: trivial change in debug string (remove one more "\n")
- patch 17-18: ioapic remote irr patches, sent separately already. So
  removed from this series.
- patch 24:
  - fix commit message: only irqfd msi routes are maintained, not all
    msi routes.
  - skip all IOAPIC msi entries (dev == NULL). We only need to
    housekeep irqfd users.
- added patches
  - pick up Radim's patch on adding MHMV ecap bits [Radim]
- remove all vtd_* patches, instead, use x86-iommu ones in the first
  place.
  This introduced lots of patch order changes and content changes,
  which affected everything from original patch 8 to the end. Sorry!
  [Jan]

v6 changes:
- patch 10: use write_with_attrs() rather than write(), preparing for
  SID verification [Jan]
- patch 17-18: add r-b line from Radim [Radim]
- new patch 19: put together Jan's EIM patch [Jan]
- new patch 20: add SID validation process
- new patch 21-22: introduce X86IOMMU class, which is the parent of
  IntelIOMMU class. Patch 21 only introduces the class and does
  nothing else, patch 22 cleans up all the vtd_*() hooks into x86
  ones. This is only a start. In the future, we can abstract more
  things into X86IOMMU class, like iotlb, address spaces mgmt,
  etc. [Jan]
- new patch 23-25: this is to do IEC notify to all irqfd consumers
  like vhost/vfio. Patch 23 changed the interface of
  kvm_irqchip_add_msi_route() to provide vector info rather than a
  raw MSI message. Patch 24 added new hooks to do arch-specific
  notification on addition/deletion of msi routes. Patch 25 is x86
  specific, which added one more IEC notifier for msi routes. [Jan]
- new patch 26: this is to partially solve the issue that Jan has
  encountered (1 sec delay when invalidating IR cache).

v5 changes:
- patch 10: add vector checking for IOAPIC interrupts (this may help
  debugging in the future, will only generate a warning if
  IOMMU_DEBUG is specified)
- patch 13: replace error_report() with a trace. [Jan]
- patch 14: rename parameter "intr" to "intremap", to be aligned with
  kernel parameter [Jan]
- patch 15: fix comments for vtd_iec_notify_fn
- patch 17 & 18 (added): fix issue when IR is enabled with devices
  using level-triggered interrupts, like e1000. Adding it to the end
  of series, since this issue never happens without IR. Patch 17 adds
  read-only check for IOAPIC entries. Patch 18 clears remote IRR bit
  when entry is configured as edge-triggered.

v4 changes (all patch numbers correspond to v3):
- add one patch at the start of v3 series: I missed sending the first
  patch in v3. adding it in.
  [Jan]
- patch 9: add support for compatible mode (no reason not to support
  it, if not, we will get some warnings when using split irqchip)
- patch 11: further simplify ioapic_update_kvm_routes() using the
  helper function.
- patch 12: tweak on kvm_arch_fixup_msi_route() rather than
  ioapic_update_kvm_routes() only. [Radim]
- add patch 15: introduce IEC (Interrupt Entry Cache) invalidation
  notifier list. We can register to this list if we want to be
  notified when we get IR invalidation requests [Radim]
- add patch 16: let IOAPIC be the first consumer of the above IEC
  notifier list. [Radim]
- several other trivial fixes (like moving some defines from .c to .h,
  moving several lines of changes from one patch to another to make
  more sense, etc.)

v3 changes (all patch numbers correspond to v2):
- patch 1 (-> v3 patch 13)
  - move to the end of series [Alex]
- patch 10 (dropped)
  - drop this one, since re-worked on IOAPIC support, so we do not
    need this any more.
- patch 12 (-> v3 patch 10)
  - leverage MSI path for IOAPIC IR [Jan]
- patch 13 (-> v3 patch 9)
  - remove vtd_interrupt_remap_msi() declaration by reordering the
    functions [mst]
  - vtd_generate_msi_message(): init msg using {}, remove FIXME [mst]
- new patches
  - v3 patch 11: introduce ioapic_entry_parse() helper function
  - v3 patch 12: add support for kernel-irqchip=split. This needs more
    reviews, logically this should enable lots of things: split
    irqchip, irqfd, vhost, and irqfd support for passthrough devices
    (not tested). Please refer to the patch for more information.

v2 changes:
- patch 1
  - rename "int_remap" to "intr" in several places [Marcel]
  - remove "Intel" specific words in desc or commit message, to
    prepare for further AMD support [Marcel]
  - avoid using object_property_get_bool() [Marcel]
- patch 5
  - use PCI bus number 0xff rather than 0xf0 for the IOAPIC scope
    definition. (please let me know if anyone knows how I can prevent
    the user from using PCI bus number 0xff...
    TIA)
- patch 11
  - fix comments [Marcel]
- all
  - remove intr_supported variable [Marcel]

This patchset provides interrupt remapping (IR) support for the
emulated Intel IOMMU device.

By default, IR is disabled to be better compatible with current
QEMU. To enable IR, we can use the following command to boot an
IR-supported VM with a virtio-net device with vhost (kvm-ioapic is
not supported, so we need to specify kernel-irqchip={split|off}
here):

$ qemu-system-x86_64 -M q35,kernel-irqchip=split \
     -device intel-iommu,intremap=on \
     -enable-kvm -m 1024 \
     -netdev tap,id=net0,vhost=on \
     -device virtio-net-pci,netdev=net0 \
     -monitor telnet::3333,server,nowait \
     /var/lib/libvirt/images/vm1.qcow2

When the guest boots, we can verify whether IR is enabled by grepping
the dmesg like:

Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: IOAPIC id 0 under DRHD base 0xfed90000 IOMMU 0
Feb 19 11:21:23 localhost.localdomain kernel: DMAR-IR: Enabled IRQ remapping in x2apic mode

Testing only covers basic smoke tests for the following matrix:
- IR enabled/disabled
- kernel irqchip off/split
- network device: tap with/without vhost, e1000
- vCPU count: 1/2

Currently supported:
- Emulated/Split irqchip
- Generic PCI Devices
- vhost devices
- pass through device support? Not tested, but suppose it should work.
- IEC (Interrupt Entry Cache) invalidation notification
- EIM (from Jan)
- IRTE Source-id validation

TODO List:
- explicit IEC invalidation (currently, we do update without
  checking. Also, we can process QI invalidation in bulk, as Jan
  suggested)
- IR fault reporting
- migration support (for IOMMU as general?)
- more?
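
For readers new to the IEC notification item in the "Currently
supported" list, the idea can be sketched roughly as below. This is
only an illustration and not the code in this series: all type and
function names here (IECNotifier, iec_register(), iec_notify_all(),
iec_count_cb(), ...) are hypothetical, and the list handling is
simplified to a plain singly linked list. The only behaviour taken
from the changelog is that a clear granularity bit means a global
invalidation; index and mask are passed through without
interpretation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: called once per IEC invalidation. */
typedef void (*iec_notify_fn)(void *opaque, bool global,
                              uint32_t index, uint32_t mask);

/* One registered consumer (e.g. an IOAPIC or an MSI route user). */
typedef struct IECNotifier {
    iec_notify_fn fn;
    void *opaque;
    struct IECNotifier *next;
} IECNotifier;

typedef struct IECNotifierList {
    IECNotifier *head;
} IECNotifierList;

/* Link a caller-owned notifier at the front of the list. */
static void iec_register(IECNotifierList *list, IECNotifier *n,
                         iec_notify_fn fn, void *opaque)
{
    assert(list != NULL && n != NULL && fn != NULL);
    n->fn = fn;
    n->opaque = opaque;
    n->next = list->head;
    list->head = n;
}

/* Granularity bit clear (not set) means a global invalidation. */
static bool iec_is_global(bool granularity_bit)
{
    return !granularity_bit;
}

/* Forward one invalidation request to every registered consumer. */
static void iec_notify_all(const IECNotifierList *list,
                           bool granularity_bit,
                           uint32_t index, uint32_t mask)
{
    bool global = iec_is_global(granularity_bit);

    for (const IECNotifier *n = list->head; n != NULL; n = n->next) {
        n->fn(n->opaque, global, index, mask);
    }
}

/* Example consumer: only counts what kind of invalidation it saw. */
typedef struct IECStats {
    unsigned global_count;
    unsigned selective_count;
} IECStats;

static void iec_count_cb(void *opaque, bool global,
                         uint32_t index, uint32_t mask)
{
    IECStats *stats = opaque;

    (void)index;
    (void)mask;
    if (global) {
        stats->global_count++;
    } else {
        stats->selective_count++;
    }
}
```

A consumer would call iec_register() once after it is realized, and
the IOMMU model would call iec_notify_all() whenever the guest
requests an IEC invalidation. A real consumer would refresh its
cached routes in the callback instead of counting.
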

Jan Kiszka (1):
  intel_iommu: Add support for Extended Interrupt Mode

Peter Xu (26):
  x86-iommu: introduce parent class
  intel_iommu: rename VTD_PCI_DEVFN_MAX to x86-iommu
  x86-iommu: provide x86_iommu_get_default
  x86-iommu: q35: generalize find_add_as()
  x86-iommu: introduce "intremap" property
  acpi: enable INTR for DMAR report structure
  intel_iommu: allow queued invalidation for IR
  intel_iommu: set IR bit for ECAP register
  acpi: add DMAR scope definition for root IOAPIC
  intel_iommu: define interrupt remap table addr register
  intel_iommu: handle interrupt remap enable
  intel_iommu: define several structs for IOMMU IR
  intel_iommu: add IR translation faults defines
  intel_iommu: Add support for PCI MSI remap
  q35: ioapic: add support for emulated IOAPIC IR
  ioapic: introduce ioapic_entry_parse() helper
  intel_iommu: add support for split irqchip
  x86-iommu: introduce IEC notifiers
  ioapic: register IOMMU IEC notifier for ioapic
  intel_iommu: add SID validation for IR
  kvm-irqchip: simplify kvm_irqchip_add_msi_route
  kvm-irqchip: i386: add hook for add/remove virq
  kvm-irqchip: x86: add msi route notify fn
  kvm-irqchip: do explicit commit when update irq
  kvm-all: add trace events for kvm irqchip ops
  intel_iommu: disallow kernel-irqchip=on with IR

Radim Krčmář (1):
  intel_iommu: support all masks in interrupt entry cache invalidation

 Makefile.objs                     |   1 +
 hw/i386/Makefile.objs             |   2 +-
 hw/i386/acpi-build.c              |  43 +++-
 hw/i386/intel_iommu.c             | 462 ++++++++++++++++++++++++++++++++++++--
 hw/i386/intel_iommu_internal.h    |  50 ++++-
 hw/i386/kvm/pci-assign.c          |  10 +-
 hw/i386/pc.c                      |   3 +
 hw/i386/trace-events              |   3 +
 hw/i386/x86-iommu.c               | 128 +++++++++++
 hw/intc/ioapic.c                  | 135 +++++++----
 hw/misc/ivshmem.c                 |   4 +-
 hw/pci/pci.c                      |  15 ++
 hw/vfio/pci.c                     |  12 +-
 hw/virtio/virtio-pci.c            |  10 +-
 include/hw/acpi/acpi-defs.h       |  13 ++
 include/hw/i386/apic-msidef.h     |   1 +
 include/hw/i386/intel_iommu.h     | 175 ++++++++++++++-
 include/hw/i386/ioapic_internal.h |   3 +
 include/hw/i386/pc.h              |   4 +
 include/hw/i386/x86-iommu.h       | 103 +++++++++
 include/hw/pci-host/q35.h         |   8 +
 include/hw/pci/pci.h              |   2 +
 include/sysemu/kvm.h              |  21 +-
 kvm-all.c                         |  19 +-
 kvm-stub.c                        |   6 +-
 target-arm/kvm.c                  |  11 +
 target-i386/kvm.c                 | 109 ++++++++-
 target-i386/trace-events          |   7 +
 target-mips/kvm.c                 |  11 +
 target-ppc/kvm.c                  |  11 +
 target-s390x/kvm.c                |  11 +
 trace-events                      |   3 +
 32 files changed, 1286 insertions(+), 110 deletions(-)
 create mode 100644 hw/i386/x86-iommu.c
 create mode 100644 include/hw/i386/x86-iommu.h
 create mode 100644 target-i386/trace-events

-- 
2.4.11