On Tue, 2025-07-08 at 00:22 -0700, Matthew Brost wrote:
> On Mon, Jul 07, 2025 at 11:46:36AM -0300, Maíra Canal wrote:
> > Xe can skip the reset if TDR has fired before the free job worker and can
> > also re-arm the timeout timer in some scenarios. Instead of manipulating
> > scheduler's internals, inform the scheduler that the job did not actually
> > timeout and no reset was performed through the new status code
> > DRM_GPU_SCHED_STAT_NO_HANG.
> >
> > Note that, in the first case, there is no need to restart submission if it
> > hasn't been stopped.
> >
> > Signed-off-by: Maíra Canal <mca...@igalia.com>
>
> I'm fairly certain this is correct. However, Intel's CI didn't run with
> your latest series. Can you resubmit and ensure a clean CI run before
> merging?

How can someone who's not at Intel ensure that?

P.

> CI can be a bit flaky—if you get some failures, ping me and
> I’ll let you know if they're related to this patch.
>
> With clean CI:
> Reviewed-by: Matthew Brost matthew.br...@intel.com
>
> > ---
> >  drivers/gpu/drm/xe/xe_guc_submit.c | 12 +++---------
> >  1 file changed, 3 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> > index 9c7e445b9ea7ce7e3610eadca023e6d810e683e9..f6289eeffd852e40b33d0e455d9bcc21a4fb1467 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> > @@ -1078,12 +1078,8 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
> >  	 * list so job can be freed and kick scheduler ensuring free job is not
> >  	 * lost.
> >  	 */
> > -	if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &job->fence->flags)) {
> > -		xe_sched_add_pending_job(sched, job);
> > -		xe_sched_submission_start(sched);
> > -
> > -		return DRM_GPU_SCHED_STAT_RESET;
> > -	}
> > +	if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &job->fence->flags))
> > +		return DRM_GPU_SCHED_STAT_NO_HANG;
> >
> >  	/* Kill the run_job entry point */
> >  	xe_sched_submission_stop(sched);
> > @@ -1261,10 +1257,8 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
> >  	 * but there is not currently an easy way to do in DRM scheduler. With
> >  	 * some thought, do this in a follow up.
> >  	 */
> > -	xe_sched_add_pending_job(sched, job);
> >  	xe_sched_submission_start(sched);
> > -
> > -	return DRM_GPU_SCHED_STAT_RESET;
> > +	return DRM_GPU_SCHED_STAT_NO_HANG;
> >  }
> >
> >  static void __guc_exec_queue_fini_async(struct work_struct *w)
> >
> > --
> > 2.50.0
> >