On Wed, 2025-12-03 at 12:44 -0800, Matthew Brost wrote:
> On Wed, Dec 03, 2025 at 11:56:01AM +0100, Philipp Stanner wrote:
> > On Mon, 2025-12-01 at 10:39 -0800, Matthew Brost wrote:
> > > Use new pending job list iterator and new helper functions in Xe to
> > > avoid reaching into DRM scheduler internals.
> > > 

[…]

> > > Signed-off-by: Matthew Brost <[email protected]>
> > > Reviewed-by: Niranjana Vishwanathapura <[email protected]>
> > > ---
> > >  drivers/gpu/drm/xe/xe_gpu_scheduler.c    |  4 +-
> > >  drivers/gpu/drm/xe/xe_gpu_scheduler.h    | 33 ++--------
> > >  drivers/gpu/drm/xe/xe_guc_submit.c       | 81 ++++++------------------
> > >  drivers/gpu/drm/xe/xe_guc_submit_types.h | 11 ----
> > >  drivers/gpu/drm/xe/xe_hw_fence.c         | 16 -----
> > >  drivers/gpu/drm/xe/xe_hw_fence.h         |  2 -
> > >  6 files changed, 27 insertions(+), 120 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.c b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> > > index f4f23317191f..9c8004d5dd91 100644
> > > --- a/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> > > +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> > > @@ -7,7 +7,7 @@
> > >  
> > >  static void xe_sched_process_msg_queue(struct xe_gpu_scheduler *sched)
> > >  {
> > > - if (!READ_ONCE(sched->base.pause_submit))
> > > + if (!drm_sched_is_stopped(&sched->base))
> > >           queue_work(sched->base.submit_wq, &sched->work_process_msg);
> > 
> > Sharing the submit_wq is legal. But next-level cleanliness would be if
> > struct drm_gpu_scheduler's internal components weren't touched at all.
> > That's kind of a luxury request, though.
> > 
> 
> Yes, perhaps a helper to extract the submit_wq too.

Could work; but best would be if drivers stored their own pointer.
That's not always possible (Boris recently tried to do it for Panthor),
but often it is.
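
Just to sketch what I mean (untested, and the driver-side submit_wq
member is made up; xe_gpu_scheduler of course has more fields than
shown here):

struct xe_gpu_scheduler {
	struct drm_gpu_scheduler base;
	/* Driver-owned copy of the workqueue passed to drm_sched_init(),
	 * so we never have to read base.submit_wq back out of the
	 * scheduler. */
	struct workqueue_struct *submit_wq;
	struct work_struct work_process_msg;
};

static void xe_sched_process_msg_queue(struct xe_gpu_scheduler *sched)
{
	if (!drm_sched_is_stopped(&sched->base))
		queue_work(sched->submit_wq, &sched->work_process_msg);
}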

> 
> > >  }
> > >  
> > > @@ -43,7 +43,7 @@ static void xe_sched_process_msg_work(struct work_struct *w)
> > >           container_of(w, struct xe_gpu_scheduler, work_process_msg);
> > >   struct xe_sched_msg *msg;
> > >  
> > > - if (READ_ONCE(sched->base.pause_submit))
> > > + if (drm_sched_is_stopped(&sched->base))
> > >           return;
> > >  
> > >   msg = xe_sched_get_msg(sched);
> > > diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.h b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
> > > index dceb2cd0ee5b..664c2db56af3 100644
> > > --- a/drivers/gpu/drm/xe/xe_gpu_scheduler.h
> > > +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
> > > @@ -56,12 +56,9 @@ static inline void xe_sched_resubmit_jobs(struct xe_gpu_scheduler *sched)
> > >   struct drm_sched_job *s_job;
> > >   bool restore_replay = false;
> > >  
> > > - list_for_each_entry(s_job, &sched->base.pending_list, list) {
> > > -         struct drm_sched_fence *s_fence = s_job->s_fence;
> > > -         struct dma_fence *hw_fence = s_fence->parent;
> > > -
> > > + drm_sched_for_each_pending_job(s_job, &sched->base, NULL) {
> > >           restore_replay |= to_xe_sched_job(s_job)->restore_replay;
> > > -         if (restore_replay || (hw_fence && !dma_fence_is_signaled(hw_fence)))
> > > +         if (restore_replay || !drm_sched_job_is_signaled(s_job))
> > 
> > So that's where this function is needed. You check whether that job in
> > the pending_list is signaled. 
> > 
> 
> Yes, during GT reset flows (think a device-level reset) it is
> possible we stop the scheduler in the window after a job has signaled
> but before free_job is called. We want to avoid resubmission of jobs
> which have already signaled.

I'm not so convinced then that the function should be called
drm_sched_job_is_signaled(). A job is also associated with
s_fence.finished.

Why is it that this can race here, btw? Isn't it your driver which
signals the hardware fences? How and where? Interrupts?

> 
> > >                   sched->base.ops->run_job(s_job);
> > 
> > Aaaaaahm. So you invoke your own callback. But basically just to access
> > the function pointer I suppose?
> > 
> > Since this is effectively your drm_sched_resubmit_jobs(), it is
> > definitely desirable to provide a textbook example of how to do resets
> > so that others can follow your usage.
> > 
> 
> Yes, but drm_sched_resubmit_jobs() does some nonsense with dma-fence
> that I don’t need here. Honestly, I’m a little unsure what that is
> actually doing.
> 

drm_sched_resubmit_jobs() shouldn't be used anymore; it's deprecated.
What I mean is that your code here effectively is the resubmission
code. So if you think it's the right way of doing resets, in harmony
with drm_sched, then it would be good if this code were as tidy as
possible and preferably even commented, so that we can point other
driver programmers to it as an example of idiomatic usage.


> We also use this function during VF restore after migration. This is
> a multi-step process that needs to operate on the same set of jobs at
> each step of the restore. That's what the restore_replay variable
> represents: it marks a job at the very beginning of the restore
> process, and each step along the way ensures execution starts at that
> job. Technically, once we are here in a VF restore, jobs can start
> signaling as the hardware is live. So some of this really is
> vendor-specific.
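> 
> To illustrate (simplified, not the actual code): at the very start of
> the restore we walk the pending list once and mark everything, and the
> later steps key off that mark:
> 
> 	/* Step one of the restore: pin the replay window. Later steps
> 	 * replay from the first marked job, even if jobs start
> 	 * signaling once the hardware is live again. */
> 	drm_sched_for_each_pending_job(s_job, &sched->base, NULL)
> 		to_xe_sched_job(s_job)->restore_replay = true;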
> 
> > Can't you replace ops->run_job() with a call to your functions where
> > you push the jobs to the ring, directly?
> > 
> 
> Yes, we could, but that function isn’t currently exported. Also, in
> future products, we may assign a different run_job vfunc based on
> hardware generation or queue type. So using a vfunc here makes sense as
> a bit of future-proofing. Of course, we could also have a DRM
> scheduler-level helper that invokes run_job for us.
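> 
> Something like this, say (hypothetical, not something this series
> adds):
> 
> 	static inline struct dma_fence *
> 	drm_sched_run_job(struct drm_gpu_scheduler *sched,
> 			  struct drm_sched_job *s_job)
> 	{
> 		/* Lets drivers resubmit a job without reaching into
> 		 * sched->ops themselves. */
> 		return sched->ops->run_job(s_job);
> 	}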

OK.

But the same comment as above applies: having distinct pointers would
be the cleanest thing.


P.
