On Fri, 11 Apr 2025 15:13:26 +0200 Christian König <christian.koe...@amd.com> wrote:
> >
> >> Background is that you don't get a crash, nor error message, nor
> >> anything indicating what is happening.
> >
> > The job times out at some point, but we might get stuck in the fault
> > handler waiting for memory, which is pretty close to a deadlock, I
> > suspect.
>
> I don't know those drivers that well, but at least for amdgpu the
> problem would be that the timeout handling would need to grab some of
> the locks the memory management is holding waiting for the timeout
> handling to do something....
>
> So that basically perfectly closes the circle. With a bit of lock you
> get a message after some time that the kernel is stuck, but since
> that are all sleeping locks I strongly doubt so.
>
> As immediately action please provide patches which changes those
> GFP_KERNEL into GFP_NOWAIT.

Sure, I can do that.
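
For reference, a minimal sketch of the kind of change being asked for. The
helper fault_track_page() and struct fault_entry are hypothetical names made
up for illustration, not code from any real driver, and the userspace
stand-ins at the top exist only so the snippet builds outside a kernel tree:

```c
#ifndef __KERNEL__
/* Userspace stand-ins (assumptions, not the kernel definitions). */
#include <errno.h>
#include <stdlib.h>
typedef unsigned int gfp_t;
#define GFP_NOWAIT ((gfp_t)0)	/* placeholder value, not the kernel's */
static void *kzalloc(size_t size, gfp_t flags)
{
	(void)flags;
	return calloc(1, size);
}
static void kfree(void *p)
{
	free(p);
}
#else
#include <linux/errno.h>
#include <linux/slab.h>
#endif

/* Hypothetical per-fault bookkeeping entry. */
struct fault_entry {
	unsigned long addr;
};

/*
 * Hypothetical fault-path helper. GFP_NOWAIT does not sleep waiting for
 * reclaim, so the allocation can fail under memory pressure; the caller
 * has to handle -ENOMEM instead of blocking in the fault handler.
 */
static int fault_track_page(unsigned long addr, struct fault_entry **out)
{
	struct fault_entry *e;

	e = kzalloc(sizeof(*e), GFP_NOWAIT);	/* previously GFP_KERNEL */
	if (!e)
		return -ENOMEM;

	e->addr = addr;
	*out = e;
	return 0;
}
```

The cost is that the fault path now has to cope with allocation failure,
which GFP_KERNEL mostly hid by sleeping until memory became available.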