..snip.. > > There doesn't seem to be much good reason for SWIOTLB to be more special > > than other DMA API backends, and not all of them have enough internal state > > to > > be able to make such a check. It's also not necessarily possible to "prevent > > damage" anyway - if a driver does pass a bogus size for > > dma_unmap_single(..., > > DMA_FROM_DEVICE), SWIOTLB might be able to keep itself internally > > consistent, > > but it still can't prevent the arch code in the middle from invalidating > > the wrong > > cache lines and potentially corrupting adjacent memory. > > > > In short, trying to work around broken drivers is a much worse idea than > > just > > fixing those drivers, and that's what we already have dma-debug for. > > > > Robin. > > Hi Robin, > > I agree that hacking kernel to fix broken drivers is not acceptable, actually > we spent days to fight driver supplier with this, they do not want to change > their code and want fix it directly in kernel.
You could upstream the driver? That way it will be fixed. I am not sure if you are aware but there is a staging directory as well in Linux. > > I tried Dma-debug yesterday, it works very well, but I think only the size > mismatch check may not be enough for the map entry corrupt situation, some > run-time warning may be better when the real corruption happen. > > For most of the dma-api backend, the size mismatch may do no harm at all, and > even in SWIOTLB itself when the bounce buffer is not used, the size mismatch > do no harm either. In our case, the same buggy driver works well when board > has 2G DDR, but panic frequently in 4G DDR because of the use of bounce > buffer and these corrupted map entries. it is hard to catch this kind of > bugs, for when the corruption happen, the kernel has all kind of reasons to > panic, but not even one may directly point to the real source. > > Add the warning messages is a big convenience for figure this kind of issues, > at least to me and the AP driver supplier, such warnings may save weeks of > hard debug time. I would prefer that all of those warning messages go in the DMA debug API - as you can have multiple DMA different backends. In other words, instead of being in one implementation it would make much more sense to be in the generic API layer.