Hi all,

I'm currently looking at an issue with an NVMe device, which isn't working properly under some specific conditions.
The issue comes down to my platform having DMA addressing restrictions: only 3 of the total 4GiB of RAM are device addressable, so a fair number of DMA mappings go through the SWIOTLB. With this NVMe device I'm getting a request with a ~520KiB data payload. System memory isn't heavily fragmented at that point yet, so the payload gets mapped into a single DMA segment in nvme_map_data(). Due to the addressing restrictions the mapping is handed to the SWIOTLB, which is unable to satisfy it because of the maximum segment size SWIOTLB imposes, even though plenty of TLB space is still available.

Currently a SWIOTLB slab is 2KiB (IO_TLB_SHIFT) in size and the maximum segment size is IO_TLB_SEGSIZE = 128 slabs, so a single mapping is capped at 256KiB, well below the ~520KiB payload. The DMA mapping therefore fails, and the block layer retries the request indefinitely.

I can work around the issue at hand simply by bumping IO_TLB_SEGSIZE to 512, but that doesn't seem like a very robust solution. Do we need a SWIOTLB allocator that doesn't exhibit linear complexity with the maximum segment size? Some buddy scheme, maybe?

Splitting the DMA segment doesn't seem to be an option either, as the documentation states that dma_map_sg may return fewer segments as a result of the mapping operation, not more. I'm not sure how far that assumption is ingrained into the users of the API.

Regards,
Lucas
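
P.S. For anyone who wants to double-check the arithmetic, here is a trivial userspace sketch (not kernel code) using the current IO_TLB_SHIFT and IO_TLB_SEGSIZE values from include/linux/swiotlb.h; the ~520KiB figure is just the payload size of the failing request, hard-coded for illustration:

/* Userspace sketch only -- constants mirror include/linux/swiotlb.h. */
#include <stdio.h>

#define IO_TLB_SHIFT    11   /* one SWIOTLB slab is 2KiB */
#define IO_TLB_SEGSIZE  128  /* max contiguous slabs per mapping */

int main(void)
{
        unsigned long slab_size = 1UL << IO_TLB_SHIFT;
        unsigned long max_seg   = IO_TLB_SEGSIZE * slab_size;  /* 256KiB cap */
        unsigned long payload   = 520UL * 1024;                /* the failing request */

        printf("max SWIOTLB mapping: %lu KiB\n", max_seg / 1024);
        printf("request payload:     %lu KiB (%s)\n", payload / 1024,
               payload > max_seg ? "mapping fails" : "fits");

        /* the IO_TLB_SEGSIZE=512 workaround raises the cap to 1MiB */
        printf("with IO_TLB_SEGSIZE=512: %lu KiB\n",
               (512UL << IO_TLB_SHIFT) / 1024);
        return 0;
}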