On 06/11/2024 13:34, Boris Brezillon wrote:
> On Wed, 6 Nov 2024 13:17:29 +0000
> Steven Price <steven.pr...@arm.com> wrote:
> 
>> On 06/11/2024 12:07, Liviu Dudau wrote:
>>> Similar to cac075706f29 ("drm/panthor: Fix race when converting
>>> group handle to group object") we need to use the XArray's internal
>>> locking when retrieving a pointer from there for heap and vm.
>>>
>>> Reported-by: Jann Horn <ja...@google.com>
>>> Cc: Boris Brezillon <boris.brezil...@collabora.com>
>>> Cc: Steven Price <steven.pr...@arm.com>
>>> Signed-off-by: Liviu Dudau <liviu.du...@arm.com>
>>> ---
>>>  drivers/gpu/drm/panthor/panthor_heap.c | 15 +++++++++++++--
>>>  drivers/gpu/drm/panthor/panthor_mmu.c  |  2 ++
>>>  2 files changed, 15 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/panthor/panthor_heap.c 
>>> b/drivers/gpu/drm/panthor/panthor_heap.c
>>> index 3796a9eb22af2..fe0bcb6837f74 100644
>>> --- a/drivers/gpu/drm/panthor/panthor_heap.c
>>> +++ b/drivers/gpu/drm/panthor/panthor_heap.c
>>> @@ -351,6 +351,17 @@ int panthor_heap_create(struct panthor_heap_pool *pool,
>>>     return ret;
>>>  }
>>>  
>>> +static struct panthor_heap *panthor_heap_from_id(struct pathor_heap_pool 
>>> *pool, u32 id)
>>> +{
>>> +   struct panthor_heap *heap;
>>> +
>>> +   xa_lock(&pool->xa);
>>> +   heap = xa_load(&pool->xa, id);
>>> +   xa_unlock(&pool->va);
>>> +
>>> +   return heap;
>>> +}  
>>
>> This locking doesn't actually achieve anything - XArray already deals
>> with the concurrency (with RCU), and if we're doing nothing more than an
>> xa_load() then we don't need (extra) locking (unless using the __
>> prefixed functions).
>>
>> And, as Boris has pointed out, pool->lock is held. As you mention in
>> your email the missing bit might be panthor_heap_pool_release() - if
>> it's not holding a lock then the heap could be freed immediately after
>> panthor_heap_from_id() returns (even with the above change).
> 
> Hm, if we call panthor_heap_from_id(), that means we have a heap pool to
> pass, and incidentally, we're supposed to hold a ref on this pool. So I
> don't really see how the heap pool can go away, unless someone messed
> up with the refcounting in the meantime.

No I'm not sure how it could happen... ;) Hence the "might" - I'd
assumed Liviu had a good reason for thinking there might be a
race/missing ref.

Really it's panthor_heap_destroy_locked() that we need to consider
racing with - as that can remove (and free) an entry from the XArray. It
might be worth putting an lockdep annotation in there to check that the
lock is indeed held. But the code currently looks correct.

Steve

>>
>> Steve
>>
>>> +
>>>  /**
>>>   * panthor_heap_return_chunk() - Return an unused heap chunk
>>>   * @pool: The pool this heap belongs to.
>>> @@ -375,7 +386,7 @@ int panthor_heap_return_chunk(struct panthor_heap_pool 
>>> *pool,
>>>             return -EINVAL;
>>>  
>>>     down_read(&pool->lock);
>>> -   heap = xa_load(&pool->xa, heap_id);
>>> +   heap = panthor_heap_from_id(pool, heap_id);
>>>     if (!heap) {
>>>             ret = -EINVAL;
>>>             goto out_unlock;
>>> @@ -438,7 +449,7 @@ int panthor_heap_grow(struct panthor_heap_pool *pool,
>>>             return -EINVAL;
>>>  
>>>     down_read(&pool->lock);
>>> -   heap = xa_load(&pool->xa, heap_id);
>>> +   heap = panthor_heap_from_id(pool, heap_id);
>>>     if (!heap) {
>>>             ret = -EINVAL;
>>>             goto out_unlock;
>>> diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c 
>>> b/drivers/gpu/drm/panthor/panthor_mmu.c
>>> index 8ca85526491e6..8b5cda9d21768 100644
>>> --- a/drivers/gpu/drm/panthor/panthor_mmu.c
>>> +++ b/drivers/gpu/drm/panthor/panthor_mmu.c
>>> @@ -1580,7 +1580,9 @@ panthor_vm_pool_get_vm(struct panthor_vm_pool *pool, 
>>> u32 handle)
>>>  {
>>>     struct panthor_vm *vm;
>>>  
>>> +   xa_lock(&pool->xa);
>>>     vm = panthor_vm_get(xa_load(&pool->xa, handle));
>>> +   xa_unlock(&pool->va);
>>>  
>>>     return vm;
>>>  }  
>>
> 

Reply via email to