On Wed, Jan 29, 2025 at 10:58:00AM -0400, Jason Gunthorpe wrote:
> On Wed, Jan 29, 2025 at 02:44:12PM +0100, Eric Auger wrote:
> > On 1/11/25 4:32 AM, Nicolin Chen wrote:
> > > For systems that require MSI pages to be mapped into the IOMMU translation
> > > the IOMMU driver provides an IOMMU_RESV_SW_MSI range, which is the default
> > > recommended IOVA window to place these mappings. However, there is nothing
> > > special about this address. And to support the RMR trick in VMM for nested
> > well at least it shall not overlap VMM's RAM. So it was not random either.
> > > translation, the VMM needs to know what sw_msi window the kernel is using.
> > > As there is no particular reason to force VMM to adopt the kernel default,
> > > provide a simple IOMMU_OPTION_SW_MSI_START/SIZE ioctl that the VMM can use
> > > to directly specify the sw_msi window that it wants to use, which replaces
> > > and disables the default IOMMU_RESV_SW_MSI from the driver to avoid having
> > > to build an API to discover the default IOMMU_RESV_SW_MSI.
> > IIUC the MSI window will then be different when using legacy VFIO
> > assignment and iommufd backend.
> 
> ? They use the same, iommufd can have userspace override it. Then it
> will ignore the reserved region.
> 
> > MSI reserved regions are exposed in
> > /sys/kernel/iommu_groups/<n>/reserved_regions
> > 0x0000000008000000 0x00000000080fffff msi
>  
> > Is that configurability reflected accordingly?
> 
> ?
> 
> Nothing using iommufd should parse that sysfs file.
>  
> > How do you make sure it does not collide with other resv regions? I
> > don't see any check here.
> 
> Yes this does need to be checked, it does look missing. It still needs
> to create a reserved region in the ioas when attaching to keep the
> areas safe and it has to intersect with the incoming reserved
> regions from the driver.

Yeah, I found that the iopt_reserve_iova() call is actually missing
entirely...

While fixing this, I see a way to turn the OPTIONs back to
per-idev, if you still prefer them to be per-idev(?). Then we can
check a given input in set_option() against the device's reserved
region list from the driver, before the device attaches to any
HWPT.

Otherwise, we just rely on iopt_table_enforce_dev_resv_regions()
during an attach, keeping the option global to simplify VMMs.
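
For illustration only, the check against the reserved region list
could look roughly like the sketch below. The struct and function
names are hypothetical, not existing iommufd code, and it assumes the
list passed in already excludes the driver's default
IOMMU_RESV_SW_MSI entry, since the option replaces that one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one driver-reported reserved region.
 * Both bounds are inclusive. */
struct resv_range {
	uint64_t start;
	uint64_t last;
};

/*
 * Return true if the requested window [start, start + size - 1] is
 * non-empty, does not wrap around, and overlaps none of the given
 * reserved ranges.
 */
static bool sw_msi_window_ok(uint64_t start, uint64_t size,
			     const struct resv_range *resv, size_t nr)
{
	uint64_t last;
	size_t i;

	assert(resv != NULL || nr == 0);

	/* Reject an empty window or one whose end would overflow. */
	if (size == 0 || start > UINT64_MAX - (size - 1))
		return false;
	last = start + size - 1;

	for (i = 0; i < nr; i++) {
		/* Two inclusive ranges overlap iff each starts at or
		 * before the other ends. */
		if (start <= resv[i].last && resv[i].start <= last)
			return false;
	}
	return true;
}
```

The same overlap test would apply whether it runs in set_option()
(per-idev) or at attach time (global option); only the point at which
the device's region list becomes available differs.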

Thanks
Nicolin
