On 12/05/2025 13:56, Philipp Stanner wrote:
On Fri, 2025-04-25 at 11:20 +0100, Tvrtko Ursulin wrote:
To implement fair scheduling we will need as accurate a view as possible
into per-entity GPU time utilisation. Because sched fence execution times
are only adjusted for accuracy in the free worker, we need to process
completed jobs as soon as possible so that the metric is as up to date as
possible when viewed from the submission side of things.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursu...@igalia.com>
Cc: Christian König <christian.koe...@amd.com>
Cc: Danilo Krummrich <d...@kernel.org>
Cc: Matthew Brost <matthew.br...@intel.com>
Cc: Philipp Stanner <pha...@kernel.org>
---
  drivers/gpu/drm/scheduler/sched_main.c | 15 ++-------------
  1 file changed, 2 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 8950c7705f57..22428a1569dd 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -865,13 +865,12 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
  * drm_sched_get_finished_job - fetch the next finished job to be destroyed
  *
  * @sched: scheduler instance
- * @have_more: are there more finished jobs on the list
  *
  * Returns the next finished job from the pending list (if there is one)
  * ready for it to be destroyed.
  */
 static struct drm_sched_job *
-drm_sched_get_finished_job(struct drm_gpu_scheduler *sched, bool *have_more)
+drm_sched_get_finished_job(struct drm_gpu_scheduler *sched)
 {
        struct drm_sched_job *job, *next;
@@ -886,7 +885,6 @@ drm_sched_get_finished_job(struct drm_gpu_scheduler *sched, bool *have_more)
                /* cancel this job's TO timer */
                cancel_delayed_work(&sched->work_tdr);
-               *have_more = false;
                next = list_first_entry_or_null(&sched->pending_list,
                                                typeof(*next), list);
                if (next) {
@@ -896,10 +894,6 @@ drm_sched_get_finished_job(struct drm_gpu_scheduler *sched, bool *have_more)
                                next->s_fence->scheduled.timestamp =
                                        dma_fence_timestamp(&job->s_fence->finished);
-                       if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
-                                    &next->s_fence->finished.flags))
-                               *have_more = true;
-
                        /* start TO timer for next job */
                        drm_sched_start_timeout(sched);
                }
@@ -958,14 +952,9 @@ static void drm_sched_free_job_work(struct work_struct *w)
        struct drm_gpu_scheduler *sched =
                container_of(w, struct drm_gpu_scheduler, work_free_job);
        struct drm_sched_job *job;
-       bool have_more;
-       job = drm_sched_get_finished_job(sched, &have_more);
-       if (job) {
+       while ((job = drm_sched_get_finished_job(sched)))
                sched->ops->free_job(job);
-               if (have_more)
-                       __drm_sched_run_free_queue(sched);
-       }
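
For reference, with the hunks above applied the free-job worker ends up
reading roughly like the sketch below (reconstructed from the quoted diff;
anything that follows the loop in the real function is unchanged and
omitted here):

static void drm_sched_free_job_work(struct work_struct *w)
{
        /* Recover the scheduler from its embedded free-job work item. */
        struct drm_gpu_scheduler *sched =
                container_of(w, struct drm_gpu_scheduler, work_free_job);
        struct drm_sched_job *job;

        /*
         * Drain every already-finished job in one pass, instead of freeing
         * a single job and re-queueing the worker when have_more was set.
         */
        while ((job = drm_sched_get_finished_job(sched)))
                sched->ops->free_job(job);
}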

Are there any have_more users left after that?

Removing here what was added earlier in the series IMO makes it more
questionable whether adding that improvement was worthwhile in the first
place.

Yep, it is definitely not typical to add and then remove stuff in the same series. The reason is that the series was not intended (or expected) to be accepted as one. I was expecting the easy cleanups to get in quickly and the rest to keep iterating for who knows how long.

Regards,

Tvrtko
