> > > > Hi,
> > > >
> > > > The qemu process gets stuck when hot-adding a large amount of memory
> > > > to a virtual machine with a passthrough device.
> > > > We found it is too slow to pin and map pages in vfio_dma_do_map.
> > > > Is there any method to improve this process?
> > >
> > > At what size do you start to see problems?  The time to map a
> > > section of memory should be directly proportional to the size.  As
> > > the size is increased, it will take longer, but I don't know why
> > > you'd reach a point of not making forward progress.  Is it actually
> > > stuck or is it just taking longer than you want?  Using hugepages
> > > can certainly help, we still need to pin each PAGE_SIZE page within
> > > the hugepage, but we'll have larger contiguous regions and therefore
> > > call iommu_map() less frequently.  Please share more data.  Thanks,
> > >
> > > Alex
> >
> > It just takes longer, instead of being actually stuck.
> > We found that the problem exists when we hot-added 16G of memory, and
> > it consumed tens of minutes when we hot-added 1T of memory.
>
> Is the stall adding 1TB roughly 64 times the stall adding 16GB, or do
> we have some inflection in the size vs. time curve?  There is a cost to
> pinning and mapping through the IOMMU; perhaps we can improve that, but
> I don't see how we can eliminate it, or how it wouldn't be at least
> linear in the size of memory added, without moving to a page request
> model, which hardly any hardware currently supports.  A workaround
> might be to incrementally add memory in smaller chunks, which generates
> a less noticeable stall.  Thanks,
>
> Alex

I collected part of a report recorded by perf while I hot-added 24GB of memory:

+  63.41%   0.00%  qemu-kvm  qemu-kvm-2.8.1-25.127  [.] 0xffffffffffc7534a
+  63.41%   0.00%  qemu-kvm  [kernel.vmlinux]       [k] do_vfs_ioctl
+  63.41%   0.00%  qemu-kvm  [kernel.vmlinux]       [k] sys_ioctl
+  63.41%   0.00%  qemu-kvm  libc-2.17.so           [.] __GI___ioctl
+  63.41%   0.00%  qemu-kvm  qemu-kvm-2.8.1-25.127  [.] 0xffffffffffc71c59
+  63.10%   0.00%  qemu-kvm  [vfio]                 [k] vfio_fops_unl_ioctl
+  63.10%   0.00%  qemu-kvm  qemu-kvm-2.8.1-25.127  [.] 0xffffffffffcbbb6a
+  63.10%   0.02%  qemu-kvm  [vfio_iommu_type1]     [k] vfio_iommu_type1_ioctl
+  60.67%   0.31%  qemu-kvm  [vfio_iommu_type1]     [k] vfio_pin_pages_remote
+  60.06%   0.46%  qemu-kvm  [vfio_iommu_type1]     [k] vaddr_get_pfn
+  59.61%   0.95%  qemu-kvm  [kernel.vmlinux]       [k] get_user_pages_fast
+  54.28%   0.02%  qemu-kvm  [kernel.vmlinux]       [k] get_user_pages_unlocked
+  54.24%   0.04%  qemu-kvm  [kernel.vmlinux]       [k] __get_user_pages
+  54.13%   0.01%  qemu-kvm  [kernel.vmlinux]       [k] handle_mm_fault
+  54.08%   0.03%  qemu-kvm  [kernel.vmlinux]       [k] do_huge_pmd_anonymous_page
+  52.09%  52.09%  qemu-kvm  [kernel.vmlinux]       [k] clear_page
+   9.42%   0.12%  swapper   [kernel.vmlinux]       [k] cpu_startup_entry
+   9.20%   0.00%  swapper   [kernel.vmlinux]       [k] start_secondary
+   8.85%   0.02%  swapper   [kernel.vmlinux]       [k] arch_cpu_idle
+   8.79%   0.07%  swapper   [kernel.vmlinux]       [k] cpuidle_idle_call
+   6.16%   0.29%  swapper   [kernel.vmlinux]       [k] apic_timer_interrupt
+   5.73%   0.07%  swapper   [kernel.vmlinux]       [k] smp_apic_timer_interrupt
+   4.34%   0.99%  qemu-kvm  [kernel.vmlinux]       [k] gup_pud_range
+   3.56%   0.16%  swapper   [kernel.vmlinux]       [k] local_apic_timer_interrupt
+   3.32%   0.41%  swapper   [kernel.vmlinux]       [k] hrtimer_interrupt
+   3.25%   3.21%  qemu-kvm  [kernel.vmlinux]       [k] gup_huge_pmd
+   2.31%   0.01%  qemu-kvm  [kernel.vmlinux]       [k] iommu_map
+   2.30%   0.00%  qemu-kvm  [kernel.vmlinux]       [k] intel_iommu_map
It seems that the bottleneck is pinning pages through get_user_pages
rather than the IOMMU mapping itself: most of the time is spent faulting
in and zeroing the anonymous huge pages under the map ioctl.

Thanks,
Wu Zongyong
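For reference, Alex's workaround of hot-adding memory in smaller chunks
can be driven from the QEMU HMP monitor with standard object_add /
device_add commands. A minimal sketch, assuming the guest was started
with hotplug headroom (all sizes, slot counts, and IDs below are
illustrative, not taken from the report above):

    # start the guest with spare DIMM slots and maxmem headroom:
    #   qemu-kvm ... -m 16G,slots=64,maxmem=1040G ...

    # then add memory as several smaller DIMMs instead of one large
    # region, so each VFIO DMA map operation pins a smaller range:
    (qemu) object_add memory-backend-ram,id=mem0,size=16G
    (qemu) device_add pc-dimm,id=dimm0,memdev=mem0
    (qemu) object_add memory-backend-ram,id=mem1,size=16G
    (qemu) device_add pc-dimm,id=dimm1,memdev=mem1
    # ... repeat for the remaining DIMMs

The total pin/map work is unchanged, but it is split into many shorter
stalls instead of one long one.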
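Alex's earlier hugepage suggestion corresponds to backing the DIMM with
a file-backed memory backend on hugetlbfs. A sketch, assuming hugetlbfs
is mounted at /dev/hugepages and enough hugepages are reserved on the
host (again, the path, sizes, and IDs are illustrative):

    # reserve 2M hugepages on the host first, e.g. 8192 x 2M = 16G:
    #   echo 8192 > /proc/sys/vm/nr_hugepages

    (qemu) object_add memory-backend-file,id=hmem0,size=16G,mem-path=/dev/hugepages,share=on,prealloc=on
    (qemu) device_add pc-dimm,id=hdimm0,memdev=hmem0

With prealloc=on the pages are faulted in and zeroed when the backend is
created; given that clear_page dominates the profile above, this should
at least move that cost out of the VFIO_IOMMU_MAP_DMA path, though the
overall hotplug time may be similar.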