On Fri, Jul 4, 2025 at 7:46 AM Danilo Krummrich <d...@kernel.org> wrote:
>
> On 7/3/25 1:27 AM, Dave Airlie wrote:
> > From: Dave Airlie <airl...@redhat.com>
> >
> > This fixes a bunch of command hangs after runtime suspend/resume.
> >
> > This fixes a regression caused by code movement in the commit below,
> > the commit seems to just change timings enough to cause this to happen
> > now, and adding the sleep seems to avoid it.
> >
> > I've spent some time trying to root cause it to no great avail,
> > it seems like a bug on the firmware side, but it could be a bug
> > in our rpc handling that I can't find.
> >
> > Either way, we should land the workaround to fix the problem,
> > while we continue to work out the root cause.
>
> I think we should add a TODO above the msleep(); what do you think would be a
> good comment here?

TODO: debug the gsp firmware or the rpc handling to find out why this
is happening and why it's Turing specific.

Don't really have a lot to go on,
Dave.

>
> I can add it when applying the patch if you want.
>
> > Signed-off-by: Dave Airlie <airl...@redhat.com>
> > Cc: Ben Skeggs <bske...@nvidia.com>
> > Cc: Danilo Krummrich <d...@kernel.org>
> > Fixes: 21b039715ce9 ("drm/nouveau/gsp: add hals for fbsr.suspend/resume()")
> > ---
> >  drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c
> > index baf42339f93e..ff362a6d9f5c 100644
> > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c
> > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/gsp.c
> > @@ -1744,6 +1744,9 @@ r535_gsp_fini(struct nvkm_gsp *gsp, bool suspend)
> >  			nvkm_gsp_sg_free(gsp->subdev.device, &gsp->sr.sgt);
> >  			return ret;
> >  		}
> > +
> > +		/* without this Turing ends up resetting all channels after resume. */
> > +		msleep(50);
> >  	}
> >
> >  	ret = r535_gsp_rpc_unloading_guest_driver(gsp, suspend);
>