Hi Tvrtko,

On 16/07/25 11:13, Tvrtko Ursulin wrote:

On 16/07/2025 15:05, Maíra Canal wrote:
Hi Tvrtko,

On 16/07/25 10:41, Tvrtko Ursulin wrote:

On 16/07/2025 13:47, Maíra Canal wrote:
Hi Tvrtko,

On 16/07/25 05:48, Tvrtko Ursulin wrote:
The "skip reset" test waits for the timeout handler to run for the
duration of 2 * MOCK_TIMEOUT, and because the mock scheduler opted to

Would it make any sense to wait for 1.5 * MOCK_TIMEOUT? This way we
would guarantee that only one timeout happened. I'm fine with the
current solution as well.

1.5 * MOCK_TIMEOUT would work as well. I considered it, and even though I thought it would be safe, I concluded that it is better to have fewer dependencies on timings given these are two threads in this story.

Why not both? Just to make sure we won't run the timedout function
twice, but still fixing the timing dependency by using
DRM_MOCK_SCHED_JOB_RESET_SKIPPED.

Just because I think probably does not guarantee timedout worker does not run twice. It still in _theory_ could if the unit test thread would be starved by some other system activity. In practice however it does work, because nothing much is running in parallel to the unit test, for vast majority of use cases. So I thought someone would be bound to complain if I just did the 1.5x approach. If no one objects I can add that tweak, no problem.

Fair, I'm okay in keeping it as it is.

Best Regards,
- Maíra


Making the DRM_MOCK_SCHED_JOB_DONT_RESET persist allows for not having to think about timings. So slight preference to that. At least until some more advanced tests are attempted to be added.

remove the "skip reset" flag once it fires, this gives opportunity for the timeout handler to run twice. Second time the job will be removed from the mock scheduler job list and the drm_mock_sched_advance() call in the test
will fail.

Fix it by making the "don't reset" flag persist for the lifetime of the
job and add a new flag to verify that the code path had executed as
expected.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursu...@igalia.com>
Fixes: 1472e7549f84 ("drm/sched: Add new test for DRM_GPU_SCHED_STAT_NO_HANG")
 > Cc: Maíra Canal <mca...@igalia.com>> Cc: Philipp Stanner <pha...@kernel.org>
---
  drivers/gpu/drm/scheduler/tests/mock_scheduler.c | 2 +-
  drivers/gpu/drm/scheduler/tests/sched_tests.h    | 7 ++++---
  drivers/gpu/drm/scheduler/tests/tests_basic.c    | 4 ++--
  3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c b/ drivers/gpu/drm/scheduler/tests/mock_scheduler.c
index 65acffc3fea8..8e9ae7d980eb 100644
--- a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
+++ b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
@@ -219,7 +219,7 @@ mock_sched_timedout_job(struct drm_sched_job *sched_job)
      unsigned long flags;
      if (job->flags & DRM_MOCK_SCHED_JOB_DONT_RESET) {
-        job->flags &= ~DRM_MOCK_SCHED_JOB_DONT_RESET;
+        job->flags |= DRM_MOCK_SCHED_JOB_RESET_SKIPPED;
          return DRM_GPU_SCHED_STAT_NO_HANG;
      }
diff --git a/drivers/gpu/drm/scheduler/tests/sched_tests.h b/ drivers/ gpu/drm/scheduler/tests/sched_tests.h
index 63d4f2ac7074..5b262126b776 100644
--- a/drivers/gpu/drm/scheduler/tests/sched_tests.h
+++ b/drivers/gpu/drm/scheduler/tests/sched_tests.h
@@ -95,9 +95,10 @@ struct drm_mock_sched_job {
      struct completion    done;
-#define DRM_MOCK_SCHED_JOB_DONE        0x1
-#define DRM_MOCK_SCHED_JOB_TIMEDOUT    0x2
-#define DRM_MOCK_SCHED_JOB_DONT_RESET    0x4
+#define DRM_MOCK_SCHED_JOB_DONE            0x1
+#define DRM_MOCK_SCHED_JOB_TIMEDOUT        0x2
+#define DRM_MOCK_SCHED_JOB_DONT_RESET        0x4
+#define DRM_MOCK_SCHED_JOB_RESET_SKIPPED    0x8
      unsigned long        flags;
      struct list_head    link;
diff --git a/drivers/gpu/drm/scheduler/tests/tests_basic.c b/ drivers/ gpu/drm/scheduler/tests/tests_basic.c
index 55eb142bd7c5..82a41a456b0a 100644
--- a/drivers/gpu/drm/scheduler/tests/tests_basic.c
+++ b/drivers/gpu/drm/scheduler/tests/tests_basic.c
@@ -317,8 +317,8 @@ static void drm_sched_skip_reset(struct kunit *test)
      KUNIT_ASSERT_FALSE(test, done);
      KUNIT_ASSERT_EQ(test,
-            job->flags & DRM_MOCK_SCHED_JOB_DONT_RESET,
-            0);
+            job->flags & DRM_MOCK_SCHED_JOB_RESET_SKIPPED,
+            DRM_MOCK_SCHED_JOB_RESET_SKIPPED);

Maybe we could assert that job->flags & DRM_MOCK_SCHED_JOB_TIMEDOUT == 0

Could but I am not sure it is needed.

Np.


Anyway, thanks for the fix!

Reviewed-by: Maíra Canal <mca...@igalia.com>

Thank you!

Btw the failure for the record:

[09:07:20] # drm_sched_skip_reset: ASSERTION FAILED at drivers/gpu/ drm/ scheduler/tests/tests_basic.c:324
[09:07:20] Expected i == 1, but
[09:07:20]     i == 0 (0x0)
[09:07:20] [FAILED] drm_sched_skip_reset
[09:07:20]     # module: drm_sched_tests
[09:07:20] # drm_sched_basic_timeout_tests: pass:1 fail:1 skip:0 total:2
[09:07:20] # Totals: pass:1 fail:1 skip:0 total:2

Unfortunately, I didn't get this error during my test run. On the other
hand, I only ran it once before pushing the series, so that's on me.
Thanks for catching it!

No worries.

Regards,

Tvrtko

      i = drm_mock_sched_advance(sched, 1);
      KUNIT_ASSERT_EQ(test, i, 1);





Reply via email to