On Sat, May 03, 2025 at 05:59:52PM -0300, Maíra Canal wrote: > When the DRM scheduler times out, it's possible that the GPU isn't hung; > instead, a job may still be running, and there may be no valid reason to > reset the hardware. This can occur in two situations: > > 1. The GPU exposes some mechanism that ensures the GPU is still making > progress. By checking this mechanism, we can safely skip the reset, > rearm the timeout, and allow the job to continue running until > completion. This is the case for v3d and Etnaviv. > 2. TDR has fired before the IRQ that signals the fence. Consequently, > the job actually finishes, but it triggers a timeout before signaling > the completion fence. >
We have both of these cases in Xe too. We implement the requeuing in Xe via driver side function - xe_sched_add_pending_job but this looks better and will make use of this. > These two scenarios are problematic because we remove the job from the > `sched->pending_list` before calling `sched->ops->timedout_job()`. This > means that when the job finally signals completion (e.g. in the IRQ > handler), the scheduler won't call `sched->ops->free_job()`. As a result, > the job and its resources won't be freed, leading to a memory leak. > > To resolve this issue, we create a new `drm_gpu_sched_stat` that allows a > driver to skip the reset. This new status will indicate that the job > should be reinserted into the pending list, and the driver will still > signal its completion. > > Signed-off-by: Maíra Canal <mca...@igalia.com> Reviewed-by: Matthew Brost <matthew.br...@intel.com> > --- > drivers/gpu/drm/scheduler/sched_main.c | 14 ++++++++++++++ > include/drm/gpu_scheduler.h | 2 ++ > 2 files changed, 16 insertions(+) > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c > b/drivers/gpu/drm/scheduler/sched_main.c > index > 829579c41c6b5d8b2abce5ad373c7017469b7680..68ca827d77e32187a034309f881135dbc639a9b4 > 100644 > --- a/drivers/gpu/drm/scheduler/sched_main.c > +++ b/drivers/gpu/drm/scheduler/sched_main.c > @@ -568,6 +568,17 @@ static void drm_sched_job_timedout(struct work_struct > *work) > job->sched->ops->free_job(job); > sched->free_guilty = false; > } > + > + /* > + * If the driver indicated that the GPU is still running and > wants to skip > + * the reset, reinsert the job back into the pending list and > realarm the > + * timeout. > + */ > + if (status == DRM_GPU_SCHED_STAT_RUNNING) { > + spin_lock(&sched->job_list_lock); > + list_add(&job->list, &sched->pending_list); > + spin_unlock(&sched->job_list_lock); > + } > } else { > spin_unlock(&sched->job_list_lock); > } > @@ -590,6 +601,9 @@ static void drm_sched_job_timedout(struct work_struct > *work) > * This function is typically used for reset recovery (see the docu of > * drm_sched_backend_ops.timedout_job() for details). Do not call it for > * scheduler teardown, i.e., before calling drm_sched_fini(). > + * > + * As it's used for reset recovery, drm_sched_stop() shouldn't be called > + * if the scheduler skipped the timeout (DRM_SCHED_STAT_RUNNING). > */ > void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job > *bad) > { > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h > index > 1a7e377d4cbb4fc12ed93c548b236970217945e8..fe9043b6d43141bee831b5fc16b927202a507d51 > 100644 > --- a/include/drm/gpu_scheduler.h > +++ b/include/drm/gpu_scheduler.h > @@ -389,11 +389,13 @@ struct drm_sched_job { > * @DRM_GPU_SCHED_STAT_NONE: Reserved. Do not use. > * @DRM_GPU_SCHED_STAT_NOMINAL: Operation succeeded. > * @DRM_GPU_SCHED_STAT_ENODEV: Error: Device is not available anymore. > + * @DRM_GPU_SCHED_STAT_RUNNING: GPU is still running, so skip the reset. > */ > enum drm_gpu_sched_stat { > DRM_GPU_SCHED_STAT_NONE, > DRM_GPU_SCHED_STAT_NOMINAL, > DRM_GPU_SCHED_STAT_ENODEV, > + DRM_GPU_SCHED_STAT_RUNNING, > }; > > /** > > -- > 2.49.0 >