On 2017-06-12 07:00 PM, Christian König wrote:
Am 12.06.2017 um 22:31 schrieb Alex Xie:
This patch is to move a loop of unref BOs and
several memory free function calls out of
critical sections.

Signed-off-by: Alex Xie <[email protected]>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 5 ++++-
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
index a664987..02c138f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
@@ -75,9 +75,12 @@ static void amdgpu_bo_list_destroy(struct amdgpu_fpriv *fpriv, int id)
          /* Another user may have a reference to this list still */
          mutex_lock(&list->lock);
          mutex_unlock(&list->lock);
+        mutex_unlock(&fpriv->bo_list_lock);
          amdgpu_bo_list_free(list);
      }
-    mutex_unlock(&fpriv->bo_list_lock);
+    else {
+        mutex_unlock(&fpriv->bo_list_lock);
+    }

You could move the unlock of bo_list_lock even before the if.

But since you pointed it out there is quite a bug in this function:
/* Another user may have a reference to this list still */
                mutex_lock(&list->lock);
                mutex_unlock(&list->lock);
Not sure if that is still up to date, but that use case used to be illegal with mutexes.
As I understand this piece of code, these mutex_lock and mutex_unlock pair are used to make sure all other tasks have finished access of the bo list. Another side of this story is in function amdgpu_bo_list_get. These two piece of codes together make sure
we can safely destroy bo list.

Otherwise we can easily simplify these lockings.
Let me give an example here.

In function amdgpu_bo_list_get, assuming we change the code like this:
...
    down_read(&fpriv->bo_list_lock);
    result = idr_find(&fpriv->bo_list_handles, id);
    up_read(&fpriv->bo_list_lock);
    /**Line 1. Task A was scheduled away from CPU**/
    if (result)
        mutex_lock(&result->lock);
...

In function amdgpu_bo_list_destroy, assuming we change the code like this:
...
    down_write(&fpriv->bo_list_lock);
    list = idr_remove(&fpriv->bo_list_handles, id);
    up_write(&fpriv->bo_list_lock);
    if (list) {
        /* Another user may have a reference to this list still */
        mutex_lock(&list->lock);
        mutex_unlock(&list->lock);
        amdgpu_bo_list_free(list);
    }
}

When task A is running in function amdgpu_bo_list_get in line 1, CPU scheduler takes CPU away from task A. Then Task B run function amdgpu_bo_list_destroy. Task B can run all the way to destroy and free mutex. Later Task A is back to run. The mutex result->lock has been destroyed by task B. Now task A try to lock a mutex
which has been destroyed and freed.



Christian.

  }
    static int amdgpu_bo_list_set(struct amdgpu_device *adev,



_______________________________________________
amd-gfx mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Reply via email to