On 17/11/2023 9:47 am, Roger Pau Monne wrote:
> The current loop that iterates from 0 to the maximum RAM address in order to
> setup the IOMMU mappings is highly inefficient, and it will get worse as the
> amount of RAM increases. It's also not accounting for any reserved regions
> past the last RAM address.
>
> Instead of iterating over memory addresses, iterate over the memory map
> regions and use a rangeset in order to keep track of which ranges need to be
> identity mapped in the hardware domain physical address space.
>
> On an AMD EPYC 7452 with 512GiB of RAM, the time to execute
> arch_iommu_hwdom_init() in nanoseconds is:
>
>     x old
>     + new
>     N           Min           Max        Median           Avg        Stddev
> x   5 2.2364154e+10  2.338244e+10 2.2474685e+10 2.2622409e+10 4.2949869e+08
> +   5       1025012       1033036       1026188     1028276.2     3623.1194
> Difference at 95.0% confidence
>         -2.26214e+10 +/- 4.42931e+08
>         -99.9955% +/- 9.05152e-05%
>         (Student's t, pooled s = 3.03701e+08)
>
> Execution time of arch_iommu_hwdom_init() goes down from ~22s to ~0.001s.
>
> Note there's a change for HVM domains (ie: PVH dom0) that get switched to
> create the p2m mappings using map_mmio_regions() instead of
> p2m_add_identity_entry(), so that ranges can be mapped with a single function
> call if possible. Note that the interface of map_mmio_regions() doesn't
> allow creating read-only mappings, but so far there are no such mappings
> created for PVH dom0 in arch_iommu_hwdom_init().
>
> No change intended in the resulting mappings that a hardware domain gets.
>
> Signed-off-by: Roger Pau Monné <roger....@citrix.com>
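
As a rough illustration of the approach described in the quoted commit message (not the patch itself), the C sketch below walks a memory map region by region and collects the page frame ranges to identity map, merging adjacent ones much as a range set would. Everything in it is hypothetical: struct region, struct pfn_range, collect_ranges(), pages_in_ranges(), and the choice to map both RAM and reserved regions are assumptions made for the example, and none of it is Xen's rangeset or IOMMU interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_MASK  ((UINT64_C(1) << PAGE_SHIFT) - 1)

/* Hypothetical region types, loosely modelled on an E820-style memory map. */
enum region_type { REGION_RAM, REGION_RESERVED, REGION_UNUSABLE };

struct region {
    uint64_t start;          /* first byte, page aligned */
    uint64_t size;           /* length in bytes, page aligned */
    enum region_type type;
};

/* Inclusive range of page frame numbers. */
struct pfn_range {
    uint64_t first, last;
};

/* Assumption for this sketch: RAM and reserved regions get identity mapped. */
static int needs_identity_map(const struct region *r)
{
    return r->type == REGION_RAM || r->type == REGION_RESERVED;
}

/*
 * Walk a sorted, non-overlapping memory map once and collect the ranges to
 * identity map, merging ranges that are directly adjacent.  The loop runs
 * once per region, not once per page.
 * Returns the number of ranges written, or -1 if 'out' is too small.
 */
static int collect_ranges(const struct region *map, size_t nr,
                          struct pfn_range *out, size_t max_out)
{
    size_t n = 0;

    for (size_t i = 0; i < nr; i++) {
        if (!needs_identity_map(&map[i]) || map[i].size == 0)
            continue;

        assert((map[i].start & PAGE_MASK) == 0);
        assert((map[i].size & PAGE_MASK) == 0);

        uint64_t first = map[i].start >> PAGE_SHIFT;
        uint64_t last = ((map[i].start + map[i].size) >> PAGE_SHIFT) - 1;

        /* Extend the previous range when this one starts right after it. */
        if (n > 0 && out[n - 1].last + 1 == first) {
            out[n - 1].last = last;
            continue;
        }

        if (n == max_out)
            return -1;

        out[n].first = first;
        out[n].last = last;
        n++;
    }

    return (int)n;
}

/* Total number of pages covered by the collected ranges. */
static uint64_t pages_in_ranges(const struct pfn_range *r, size_t n)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += r[i].last - r[i].first + 1;

    return total;
}
```

Each collected range could then be handed to a single mapping call, so the amount of work scales with the number of memory map entries and not with the highest address in the map.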

Very nice numbers.  And yes - straight line performance like this (good or
bad) is all about the innermost loop.

Sadly, the patch diff is horrible to read.  Patch 2 remaining in common
code will improve this a little, but probably not very much.

If there are no better ideas, it's probably best to split into 3 patches,
being:

1) Introduce new rangeset forms of existing operations
2) Rewrite arch_iommu_hwdom_init() to use rangesets
3) Delete old mfn forms

That at least means that the new and the old forms aren't expressed as a
delta against each other.

~Andrew