On 07.07.25 18:25, Matthew Brost wrote:
> On Mon, Jul 07, 2025 at 02:38:07PM +0200, Christian König wrote:
>> On 03.07.25 00:01, Matthew Brost wrote:
>>>> diff --git a/drivers/gpu/drm/ttm/tests/ttm_bo_test.c b/drivers/gpu/drm/ttm/tests/ttm_bo_test.c
>>>> index 6c77550c51af..5426b435f702 100644
>>>> --- a/drivers/gpu/drm/ttm/tests/ttm_bo_test.c
>>>> +++ b/drivers/gpu/drm/ttm/tests/ttm_bo_test.c
>>>> @@ -379,7 +379,7 @@ static void ttm_bo_unreserve_bulk(struct kunit *test)
>>>>  	dma_resv_fini(resv);
>>>>  }
>>>>
>>>> -static void ttm_bo_put_basic(struct kunit *test)
>>>> +static void ttm_bo_fini_basic(struct kunit *test)
>>>>  {
>>>>  	struct ttm_test_devices *priv = test->priv;
>>>>  	struct ttm_buffer_object *bo;
>>>> @@ -410,7 +410,7 @@ static void ttm_bo_put_basic(struct kunit *test)
>>>>  	dma_resv_unlock(bo->base.resv);
>>>>  	KUNIT_EXPECT_EQ(test, err, 0);
>>>>
>>>> -	ttm_bo_put(bo);
>>>> +	ttm_bo_fini(bo);
>>>
>>> Intel's CI [1], see Kunit tab, is indicating an issue with the
>>> selftests.
>>
>> Even without any change the ttm_bo_validate subtest is crashing for me and I
>> was about to disable those crashing tests.
>>
>> My guess is that the test never worked 100% reliable and relies on some
>> incorrect assumptions.
>>
>
> Hmm, this seems to work in our CI pretty reliably but in general I am
> not a fan of selftests, particularly ones so fragile that any small
> change of behavior breaks the tests. If this is indeed one of cases
> (testing really specific behavior), fine with disabling it.

The ttm_bo_validate_test is crashing 100% reliably on my build box. Skimming
over the code I've found at least one incorrect use of locks, but fixing that
doesn't seem to stop the crash.

Going to take a closer look tomorrow.

Regards,
Christian.

>
>>> Unsure if this suggestion would fix the kunit failure, but
>>> would it not be better to just ref count gem BOs in the kunit tests and
>>> create a mock drm_gem_object_funcs ops in in which free calls
>>> ttm_bo_fini? Then in selftests replace ttm_bo_fini with
>>> drm_gem_object_put?
>>
>> Yeah that is one possible solution I had in mind as well, but I thought
>> about disabling the failed test first and then discussion with Thomas what
>> to do about it.
>>
>
> See above. Yea it Intel's main (IGTs) CI work, I'd say there is about
> 99% confidence that the changes you are making haven't broke anything.
>
> Matt
>
>> Christian.
>>
>>>
>>> Matt