On Fri, 2025-07-18 at 10:35 +0100, Tvrtko Ursulin wrote:
> 
> On 18/07/2025 10:31, Philipp Stanner wrote:
> > On Fri, 2025-07-18 at 08:13 +0100, Tvrtko Ursulin wrote:
> > > 
> > > On 16/07/2025 21:44, Maíra Canal wrote:
> > > > Hi Tvrtko,
> > > > 
> > > > On 16/07/25 11:46, Tvrtko Ursulin wrote:
> > > > > 
> > > > > On 16/07/2025 15:30, Maíra Canal wrote:
> > > > > > Hi Tvrtko,
> > > > > > 
> > > > > > On 16/07/25 10:49, Tvrtko Ursulin wrote:
> > > > > > > 
> > > > > > > On 16/07/2025 14:31, Maíra Canal wrote:
> > > > > > > > Hi Tvrtko,
> > > > > > > > 
> > > > > > > > On 16/07/25 05:51, Tvrtko Ursulin wrote:
> > > > > > > > > > Currently the job free work item will lock
> > > > > > > > > > sched->job_list_lock a first time to see if there
> > > > > > > > > > are any jobs, free a single job, and then lock
> > > > > > > > > > again to decide whether to re-queue itself if
> > > > > > > > > > there are more finished jobs.
> > > > > > > > > 
> > > > > > > > > > Since drm_sched_get_finished_job() already looks
> > > > > > > > > > at the second job in the queue, we can simply add
> > > > > > > > > > the signaled check and have it return to the
> > > > > > > > > > caller whether there are more jobs to be freed.
> > > > > > > > > > That way the work item does not have to lock the
> > > > > > > > > > list again and repeat the signaled check.
> > > > > > > > > 
> > > > > > > > > > Signed-off-by: Tvrtko Ursulin <tvrtko.ursu...@igalia.com>
> > > > > > > > > Cc: Christian König <christian.koe...@amd.com>
> > > > > > > > > Cc: Danilo Krummrich <d...@kernel.org>
> > > > > > > > > Cc: Maíra Canal <mca...@igalia.com>
> > > > > > > > > Cc: Matthew Brost <matthew.br...@intel.com>
> > > > > > > > > Cc: Philipp Stanner <pha...@kernel.org>
> > > > > > > > > ---
> > > > > > > > > v2:
> > > > > > > > >    * Improve commit text and kerneldoc. (Philipp)
> > > > > > > > >    * Rename run free work helper. (Philipp)
> > > > > > > > > 
> > > > > > > > > v3:
> > > > > > > > >    * Rebase on top of Maira's changes.
> > > > > > > > > ---
> > > > > > > > > >    drivers/gpu/drm/scheduler/sched_main.c | 53 ++++++++++----------------
> > > > > > > > > >    1 file changed, 21 insertions(+), 32 deletions(-)
> > > > > > > > > > 
> > > > > > > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > > > > index e2cda28a1af4..5a550fd76bf0 100644
> > > > > > > > > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > > > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > > > > @@ -349,34 +349,13 @@ static void drm_sched_run_job_queue(struct drm_gpu_scheduler *sched)
> > > > > > > > > >    }
> > > > > > > > > >    /**
> > > > > > > > > > - * __drm_sched_run_free_queue - enqueue free-job work
> > > > > > > > > > - * @sched: scheduler instance
> > > > > > > > > > - */
> > > > > > > > > > -static void __drm_sched_run_free_queue(struct drm_gpu_scheduler *sched)
> > > > > > > > > > -{
> > > > > > > > > > -    if (!READ_ONCE(sched->pause_submit))
> > > > > > > > > > -        queue_work(sched->submit_wq, &sched->work_free_job);
> > > > > > > > > > -}
> > > > > > > > > > -
> > > > > > > > > > -/**
> > > > > > > > > > - * drm_sched_run_free_queue - enqueue free-job work if ready
> > > > > > > > > > + * drm_sched_run_free_queue - enqueue free-job work
> > > > > > > > > >     * @sched: scheduler instance
> > > > > > > > > >     */
> > > > > > > > > >    static void drm_sched_run_free_queue(struct drm_gpu_scheduler *sched)
> > > > > > > > > >    {
> > > > > > > > > > -    struct drm_sched_job *job;
> > > > > > > > > > -
> > > > > > > > > > -    job = list_first_entry_or_null(&sched->pending_list,
> > > > > > > > > > -                       struct drm_sched_job, list);
> > > > > > > > > > -    if (job && dma_fence_is_signaled(&job->s_fence->finished))
> > > > > > > > > > -        __drm_sched_run_free_queue(sched);
> > > > > > > > 
> > > > > > > > I believe we'd still need this chunk for
> > > > > > > > DRM_GPU_SCHED_STAT_NO_HANG (check the comment in
> > > > > > > > drm_sched_job_reinsert_on_false_timeout()). How
> > > > > > > 
> > > > > > > You mean the "is there a signaled job in the list" check
> > > > > > > is needed for drm_sched_job_reinsert_on_false_timeout()?
> > > > > > > Hmm, why? Worst case is a false positive wakeup on the
> > > > > > > free worker, no?
> > > > > > 
> > > > > > Correct me if I'm mistaken, but we would also have a false
> > > > > > positive wake-up on the run_job worker, which I believe
> > > > > > could be problematic in cases where we skipped the reset
> > > > > > because the job is still running.
> > > > > 
> > > > > The run_job worker exits when it sees no free credits, so I
> > > > > don't think there is a problem. What am I missing?
> > > > > 
> > > > 
> > > > I was the one missing the code in `drm_sched_can_queue()`.
> > > > Sorry for the misleading comments. This is:
> > > > 
> > > > Reviewed-by: Maíra Canal <mca...@igalia.com>
> > > 
> > > No worries, and thanks!
> > > 
> > > Philipp - are you okay with this version? V2 was done to address
> > > your feedback, so that should be good now.
> > 
> > Was just giving it another spin when you wrote. (A [PATCH v3]
> > would've been neat for identification, though – I almost pulled
> > the wrong patch from the archive *wink*)
> 
> Oops, my bad.
> 
> > LGTM, improves things, can be merged.
> > 
> > However, we had to merge Lin Cao's bug fix [1] recently. That one
> > is now in drm-misc-fixes, and your patch should go to
> > drm-misc-next. This would cause a conflict once the two branches
> > meet.
> > 
> > So I suggest we hold this non-urgent patch until drm-misc-fixes /
> > Linus's -rc gets merged into drm-misc-next, and then we apply it.
> > Should be next week or the week after, AFAIK.
> > 
> > Unless somebody has a better idea, of course?
> 
> Lin's patch touches sched_entity.c only and mine only sched_main.c -
> i.e. no conflict AFAICT?

Aaahhh, I had a hallucination ^^'

It doesn't apply to drm-misc-fixes, but that is because fixes is
missing changes that yours is based on. Because Lin's patch was the
last thing I touched on that branch, I seem to have jumped to that
conclusion.

Should be fine, then. My bad.

Will apply.


P.


> 
> Regards,
> 
> Tvrtko
> 
> > 
> > Remind me in case I forget.
> > 
> > 
> > P.
> > 
> > [1]
> > https://gitlab.freedesktop.org/drm/misc/kernel/-/commit/15f77764e90a713ee3916ca424757688e4f565b9
> > 
> > 
> > > 
> > > Regards,
> > > 
> > > Tvrtko
> > > 
> > 
> 
