> >>> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> >>> index 0a4881e59aa7..3d9b17fe5771 100644
> >>> --- a/kernel/dma/direct.c
> >>> +++ b/kernel/dma/direct.c
> >>> @@ -374,9 +374,11 @@ void dma_direct_unmap_sg(struct device *dev, struct scatterlist *sgl,
> >>>  	struct scatterlist *sg;
> >>>  	int i;
> >>>
> >>> -	for_each_sg(sgl, sg, nents, i)
> >>> +	for_each_sg(sgl, sg, nents, i) {
> >>>  		dma_direct_unmap_page(dev, sg->dma_address, sg_dma_len(sg), dir,
> >>>  			     attrs);
> >>> +		sg->dma_address = DMA_MAPPING_ERROR;
> >>
> >> There are more DMA API backends than just dma-direct, so while this
> >> might help paper over bugs when SWIOTLB is in use, it's not going to
> >> have any effect when those same bugs are hit under other circumstances.
> >> Once again, the moral of the story is that effort is better spent just
> >> fixing the bugs ;)
> >
> > Thanks for the quick feedback. What is the correct fix? I understand
> > the first half. The NVMe driver should be updated to not call unmap on
> > an address that has already been unmapped within the DMA direct code.
> > Where I'm less certain is how to communicate to the NVMe driver that
> > the mapping failed. In particular, the NVMe code explicitly checks if
> > the first DMA address in the scatter/gather list is set to
> > DMA_MAPPING_ERROR. Thus, don't we need to update the DMA direct code
> > to propagate DMA_MAPPING_ERROR back up to the driver, via the
> > scatter/gather struct?
>
> Erm, you check the return value of dma_map_sg(). If it's zero, the
> request failed; if it's nonzero, that's how many DMA segments you now
> have to process. See Documentation/core-api/dma-api.rst.
>
> The only guarantee offered about the state of the scatterlist itself is
> that if it is successfully mapped, then the dma_address and dma_length
> fields are valid for that many segments, and if that is fewer than the
> total number of physical segments then the next one after the final DMA
> segment will have a dma_length of 0. In particular there are no
> guarantees at all about the state if the mapping was unsuccessful.
>
> If a driver is failing to keep track of the success/failure status and
> later down the line trying to guess what to do with a list that may or
> may not have been mapped, then frankly that driver should be redesigned
> because that is a terrible anti-pattern. At the very very least it
> should explicitly encode its own "known bad" state upon failure that it
> can then reliably recognise later.
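So the driver should record the return value of dma_map_sg() in its own
per-request state and key the unmap path off that, rather than trying to
infer success from the scatterlist afterwards. For my own understanding,
a rough sketch of that pattern (the struct and function names below are
placeholders for illustration, not the actual NVMe code; the direction is
picked arbitrarily):

#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>

/* Placeholder per-request state; a real driver keeps its own. */
struct my_request {
	struct scatterlist *sgl;	/* physical segments */
	int nents;			/* number of physical segments */
	int nr_mapped;			/* 0 is our explicit "not mapped" state */
};

static int my_map_request(struct device *dev, struct my_request *req)
{
	/*
	 * dma_map_sg() returns 0 on failure, otherwise the number of DMA
	 * segments to process.  Record that result instead of inspecting
	 * the scatterlist later.
	 */
	req->nr_mapped = dma_map_sg(dev, req->sgl, req->nents, DMA_TO_DEVICE);
	if (!req->nr_mapped)
		return -ENOMEM;
	return 0;
}

static void my_unmap_request(struct device *dev, struct my_request *req)
{
	/* Only unmap a list we know was successfully mapped ... */
	if (!req->nr_mapped)
		return;

	/* ... and pass the original nents, not the mapped count. */
	dma_unmap_sg(dev, req->sgl, req->nents, DMA_TO_DEVICE);
	req->nr_mapped = 0;
}

In other words, the success/failure state lives entirely in the driver's
own bookkeeping, and the scatterlist is never consulted to decide whether
to unmap.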
Got it now. I'll get to work on a patch for the NVMe driver to fix the
bug. Thanks for all of these pointers. They are immensely helpful.

Thanks,
Marc