On 16.10.23 at 14:15, Karolina Stolarek wrote:
With the current cleanup flow, we could trigger a NULL pointer
dereference if a delayed destruction of a BO with a system
resource is executed by the drain_workqueue() call, because we
then attempt to free the resource using an already released
resource manager.

Remove the device from the device list and drain its workqueue
before releasing the system domain manager in ttm_device_fini().

Signed-off-by: Karolina Stolarek <karolina.stola...@intel.com>

Reviewed and pushed to drm-misc-fixes.

Thanks,
Christian

---
This is actually a reiteration of a patch sent in [1], but the
solution and commit message changed significantly, so I decided
not to send it as v2.
[1] - 20231013143423.1503088-1-karolina.stola...@intel.com

  drivers/gpu/drm/ttm/ttm_device.c | 8 ++++----
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index 7726a72befc5..d48b39132b32 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -232,10 +232,6 @@ void ttm_device_fini(struct ttm_device *bdev)
        struct ttm_resource_manager *man;
        unsigned i;
-       man = ttm_manager_type(bdev, TTM_PL_SYSTEM);
-       ttm_resource_manager_set_used(man, false);
-       ttm_set_driver_manager(bdev, TTM_PL_SYSTEM, NULL);
-
        mutex_lock(&ttm_global_mutex);
        list_del(&bdev->device_list);
        mutex_unlock(&ttm_global_mutex);
@@ -243,6 +239,10 @@ void ttm_device_fini(struct ttm_device *bdev)
        drain_workqueue(bdev->wq);
        destroy_workqueue(bdev->wq);
+       man = ttm_manager_type(bdev, TTM_PL_SYSTEM);
+       ttm_resource_manager_set_used(man, false);
+       ttm_set_driver_manager(bdev, TTM_PL_SYSTEM, NULL);
+
        spin_lock(&bdev->lru_lock);
        for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i)
                if (list_empty(&man->lru[0]))
