On Thu, 12 Jun 2025 12:18:40 +0200 Jiri Bohac <jbo...@suse.cz> wrote:
> When re-using the CMA area for kdump there is a risk of pending DMA
> into pinned user pages in the CMA area.
>
> Pages residing in CMA areas can usually not get long-term pinned and
> are instead migrated away from the CMA area, so long-term pinning is
> typically not a concern. (BUGs in the kernel might still lead to
> long-term pinning of such pages if everything goes wrong.)
>
> Pages pinned without FOLL_LONGTERM remain in the CMA and may possibly
> be the source or destination of a pending DMA transfer.
>
> Although there is no clear specification how long a page may be pinned
> without FOLL_LONGTERM, pinning without the flag shows an intent of the
> caller to only use the memory for short-lived DMA transfers, not a transfer
> initiated by a device asynchronously at a random time in the future.
>
> Add a delay of CMA_DMA_TIMEOUT_SEC seconds before starting the kdump
> kernel, giving such short-lived DMA transfers time to finish before
> the CMA memory is re-used by the kdump kernel.
>
> Set CMA_DMA_TIMEOUT_SEC to 10 seconds - chosen arbitrarily as both
> a huge margin for a DMA transfer, yet not increasing the kdump time
> too significantly.

Oh.  10s sounds a lot.  How long does this process typically take?

It's sad to add a 10s delay for something which some systems will
never do.  I wonder if there's some simple hack we can add.  Like
having a global flag which gets set the first time someone pins a CMA
page for DMA and, if that flag is later found to be unset, skip the
delay?

Or something else along these lines?
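
A rough sketch of the flag idea, written as a userspace model rather
than kernel code.  Everything except CMA_DMA_TIMEOUT_SEC is made up
for illustration: cma_page_ever_pinned, note_cma_page_pinned() and
cma_dma_wait_seconds() are hypothetical names, not existing kernel
symbols, and where the pin path would actually set the flag is left
open.

```c
/* Userspace model of the "skip the delay if nothing was ever pinned"
 * idea.  Not kernel code; all names below are hypothetical. */
#include <stdatomic.h>
#include <stdbool.h>

/* 10 seconds, the value given in the quoted patch description. */
#define CMA_DMA_TIMEOUT_SEC 10

/* Hypothetical global flag: latched the first time any CMA page is
 * pinned for DMA, and never cleared afterwards.  Zero-initialized,
 * so it starts out false. */
atomic_bool cma_page_ever_pinned;

/* Hypothetical hook, assumed to be called from wherever a CMA page
 * gets pinned for DMA. */
void note_cma_page_pinned(void)
{
	atomic_store_explicit(&cma_page_ever_pinned, true,
			      memory_order_relaxed);
}

/* Seconds to wait before the kdump kernel re-uses the CMA memory:
 * zero if no CMA page was ever pinned, the full timeout otherwise. */
unsigned int cma_dma_wait_seconds(void)
{
	if (!atomic_load_explicit(&cma_page_ever_pinned,
				  memory_order_relaxed))
		return 0;
	return CMA_DMA_TIMEOUT_SEC;
}
```

Because the flag only ever goes from unset to set, a system that pins
a CMA page even once would keep paying the full delay; only systems
that never do it would skip the wait.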