On 12/03/19 04:23, Peter Xu wrote:
> On Mon, Mar 11, 2019 at 03:07:43PM +0100, Paolo Bonzini wrote:
>> On 11/03/19 14:48, Sergio Lopez wrote:
>>>> The initialization is O(n^2) because the guest initializes one device at
>>>> a time, so you rebuild the FlatView first with 0 devices, then 1, then
>>>> 2, etc. This is very hard to fix, if at all possible.
>>>>
>>>> However, each FlatView creation should be O(n) where n is the number of
>>>> devices currently configured. Please check with "info mtree -f" that
>>>> you only have a fixed number of FlatViews. Old versions had one per
>>>> device.
>>> I'm seeing 9 FVs with 1 PCI, and 119 with 100 PCIs.
>>
>> With
>>
>> $ eval qemu-system-x86_64 -M q35 \
>>     -device\ e1000,id=n{1,2,3,4,5,6,7,8}{1,2,3}
>>
>> I only see 4 flat views ("system", "io", "memory", "(none)").
>>
>> Probably you are using intel-iommu? Peter, it should be possible to
>> reorganize the VT-d memory regions like this:
>>
>>   intel_iommu_ir (MMIO, not added to any container)
>>
>>   vtd_root_dmar (container)
>>     intel_iommu_dmar (IOMMU), priority 0
>>     alias to intel_iommu_ir, priority 1
>>
>>   vtd_root_nodmar
>>     alias to get_system_memory(), priority 0
>>     alias to intel_iommu_ir, priority 1
>>
>>   vtd_root_0 memory region (container)
>>     vtd_root_dmar        # only one of these is enabled
>>     vtd_root_nodmar
>>
>> where the vtd_root_dmar and vtd_root_nodmar memory regions are created
>> in vtd_init once and for all. Because all vtd_root_* memory regions
>> have only one child, memory.c will recognize that they represent the
>> same memory, and create at most two FlatViews (one for vtd_root_dmar,
>> one for vtd_root_nodmar).
>
> Yes this sounds good. The only thing I'm still uncertain is about the
> IOMMU notifiers, which should be per-device (for real).

You're right. However, the DMAR FlatView only has three sections, so I
suspect it's not a big deal if we keep it per-device. You'd still have
O(n) FlatViews when the IOMMU is present and DMAR is enabled, but they
would have a constant number of sections, so the overall cost is still
O(n) and not O(n^2).

If the IOMMU is present but DMAR is disabled, all VT-d address spaces
would still share the same FlatView (the one for vtd_root_nodmar), and
that is the configuration where the performance loss happens.

The final scheme would be the same as above, with vtd_root_dmar replaced
by vtd_root_dmar_%d.

Paolo
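
For illustration only, here is a back-of-the-envelope model of the cost argument above. It is not QEMU code. It assumes that a view of system memory has a fixed base of sections plus one section per configured device; BASE_SECTIONS and all function names are hypothetical. Only the figure of three sections per DMAR FlatView comes from the text above.

```python
# Rough cost model: total FlatView sections across all VT-d address spaces.
# Everything except DMAR_SECTIONS is an assumption made for illustration.

DMAR_SECTIONS = 3    # from the mail: a DMAR FlatView has three sections
BASE_SECTIONS = 10   # assumed: device-independent part of system memory


def sysmem_sections(n):
    """Sections in a view of system memory with n devices (assumed one each)."""
    return BASE_SECTIONS + n


def cost_nodmar_per_device(n):
    """Each device keeps a private copy of the system memory view: O(n^2)."""
    return n * sysmem_sections(n)


def cost_nodmar_shared(n):
    """All devices share the single vtd_root_nodmar view: O(n)."""
    return sysmem_sections(n)


def cost_dmar_per_device(n):
    """One vtd_root_dmar_%d view per device, constant size each: O(n)."""
    return n * DMAR_SECTIONS


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n,
              cost_nodmar_per_device(n),
              cost_nodmar_shared(n),
              cost_dmar_per_device(n))
```

Under these assumptions the per-device copies of the system memory view are the only term that grows quadratically, which is why sharing vtd_root_nodmar matters more than sharing the DMAR views.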