On Tue, May 27, 2025 at 09:04:25PM -0300, Jason Gunthorpe wrote:
> On Tue, May 27, 2025 at 01:54:28PM -0700, Shyam Saini wrote:
> > > The above is the only place that creates a IOMMU_RESV_SW_MSI so it is
> > > definately called and used, right? If not where does your
> > > IOMMU_RESV_SW_MSI come from?
> >
> > code tracing and printks in that code path suggests
> > iommu_dma_get_resv_regions()
> > called by vfio-pci driver,
>
> Yes, I know it is, that is how it setups the SW_MSI.
>
> > > As above, I've asked a few times now if your resv_regions() is
> > > correct, meaning there is a reserved range covering the address space
> > > that doesn't have working translation. That means
> > > iommu_get_resv_regions() returns such a range.
> >
> > sorry about missing that, i see msi iova being reserved:
> >
> > cat /sys/kernel/iommu_groups/*/reserved_regions
> > 0x0000000008000000 0x00000000080fffff msi
> > 0x0000000008000000 0x00000000080fffff msi
> > 0x0000000008000000 0x00000000080fffff msi
> > 0x0000000008000000 0x00000000080fffff msi
> > [output trimmed]
>
> But this does not seem correct, you should have a "reserved" region
> covering 0x8000000 as well because you say your platform cannot do DMA
> to 0x8000000 and this is why you are doing all this.
>
> All IOVA that the platform cannot DMA from should be reported in the
> reserved_regions file as "reserved". You must make your platform
> achieve this.

so should it be for all the iommu groups?

no_dma_mem {
	reg = <0x0 0x8000000 0x0 0x100000>;
	no-map;
};

i think that's how we reserve memory in general, eg: ramoops
but this doesn't show up in:
/sys/kernel/iommu_groups/*/reserved_regions

> > Yes, i tried that,
> >
> > This is how my dts node looked like
> > reserved-memory {
> > 	faulty_iova: resv_faulty {
> > 		iommu-addresses = <&pcieX 0x8000000 0x100000>;
> > 	};
> > 	..
> > 	..
> > }
> >
> > &pcieX {
> > 	memory-region = <&faulty_iova>;
> > };
> >
> > I see it working for the devices which are calling
> > iommu_get_resv_regions(), eg if I specify faulty_iova for dma
> > controller dts node then i see an additional entry in the related
> > group
>
> Exactly, it has to flow from the DT into the reserved_regions, that is
> essential.

> So what is the problem if you have figured out how to fix up
> /sys/kernel/iommu_groups/Y/reserved_regions?

sorry, i haven't yet

> If you found some cases where you can't get /sys/../reserved_regions
> to report the right things from the DT then that needs to be addressed
> first before you think about fixing SW_MSI.
>
> I very vaguely recall we have some gaps on OF where the DMA-API code
> is understanding parts of the DT that don't get mapped into
> reserved_regions and nobody has cared to fix it because it only
> effects VFIO. You may have landed in the seat that has to fix it :)

I think this is the case we are dealing with?

> But I still don't have a clear sense of what your actual problem is as
> you are show DT that seems reasonable and saying that
> /sys/../reserved_regions is working..

/sys/../reserved_regions is working for certain devices like dma controller
but it doesn't work for pcie devices and its vfio-pcie driver calling
iommu_get_resv_regions() but we don't have dts node for vfio.
I have confirmed this about pcie on two different platforms, it seems to be
OF DMA-API gap that you hinted above, happy to work on that :), it would be
great if you can share any other reference discussions to that problem

When i specify this for dma controller:

faulty_iova: resv_faulty {
	iommu-addresses = <&dmaX 0x8000000 0x100000>;
};

&dmaX {
	memory-region = <&faulty_iova>;
};

I see following:

$ cat /sys/kernel/iommu_groups/y/reserved_regions
0x0000000008000000 0x00000000080fffff reserved
0x00000000a0000000 0x00000000a00fffff msi

Clarifying the Issue with MSI and SMMU Faults on Our Platform:

We are encountering SMMU faults when using our userspace tool/driver that
relies on MSI. Specifically, the issue arises when the MSI_IOVA_BASE is set
to the current default value of 0x08000000.

The observed fault is as follows:

arm-smmu 64000000.iommu: Unhandled context fault: fsr=0x402, iova=0x00000040, fsynr=0x2f0013, cbfrsynra=0x102, cb=15

Upon investigation, our hardware team confirmed that the memory region
containing 0x08000000 is already mapped for other peripherals, making it
unavailable for MSI usage.

eg: using 0xa0000000 as MSI_IOVA_BASE solves our problem.

let me know if you have any other questions

Thanks,
Shyam
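
For reference, the check discussed in this thread (is the faulty IOVA range reported as "reserved" for a given group?) can be sketched as a small script. This is only an illustration, not part of any kernel tooling: the helper names parse_reserved_regions() and is_covered() are made up here, and the two sample strings are copied from the reserved_regions outputs quoted in this mail.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    start: int  # first IOVA of the region
    end: int    # last IOVA of the region (inclusive, as sysfs prints it)
    kind: str   # e.g. "msi" or "reserved"


def parse_reserved_regions(text: str) -> List[Region]:
    """Parse the '<start> <end> <type>' lines of a reserved_regions file."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank lines and notes such as "[output trimmed]"
        start, end, kind = fields
        regions.append(Region(int(start, 16), int(end, 16), kind))
    return regions


def is_covered(regions: List[Region], start: int, end: int,
               kind: str = "reserved") -> bool:
    """True if [start, end] lies fully inside one region of the given kind."""
    return any(r.kind == kind and r.start <= start and end <= r.end
               for r in regions)


# Sample data copied from the outputs quoted in this thread.
PCIE_GROUP = """\
0x0000000008000000 0x00000000080fffff msi
0x0000000008000000 0x00000000080fffff msi
[output trimmed]
"""
DMA_GROUP = """\
0x0000000008000000 0x00000000080fffff reserved
0x00000000a0000000 0x00000000a00fffff msi
"""

# The range the platform cannot DMA to: 0x8000000, size 0x100000.
FAULTY_START = 0x8000000
FAULTY_END = FAULTY_START + 0x100000 - 1

pcie_ok = is_covered(parse_reserved_regions(PCIE_GROUP), FAULTY_START, FAULTY_END)
dma_ok = is_covered(parse_reserved_regions(DMA_GROUP), FAULTY_START, FAULTY_END)
print("pcie group reports faulty range as reserved:", pcie_ok)
print("dma group reports faulty range as reserved:", dma_ok)
```

On a live system the same functions could be fed the contents of each /sys/kernel/iommu_groups/*/reserved_regions file instead of the sample strings.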