On 11/28/2012 07:32 PM, Maarten Lankhorst wrote:
> Op 28-11-12 16:10, Thomas Hellstrom schreef:
>> On 11/28/2012 03:46 PM, Maarten Lankhorst wrote:
>>> Op 28-11-12 15:23, Thomas Hellstrom schreef:
>>>> On 11/28/2012 02:55 PM, Maarten Lankhorst wrote:
>>>>> Op 28-11-12 14:21, Thomas Hellstrom schreef:
>>>>>> On 11/28/2012 01:15 PM, Maarten Lankhorst wrote:
>>>>>>> Op 28-11-12 12:54, Thomas Hellstrom schreef:
>>>>>>>> On 11/28/2012 12:25 PM, Maarten Lankhorst wrote:
>>>>>>>>> By removing the unlocking of lru and retaking it immediately, a
>>>>>>>>> race is removed where the bo is taken off the swap list or the
>>>>>>>>> lru list between the unlock and relock. As such the cleanup_refs
>>>>>>>>> code can be simplified: it will attempt to call ttm_bo_wait
>>>>>>>>> non-blockingly, and if that fails it will drop the locks and
>>>>>>>>> perform a blocking wait, or return an error if no_wait_gpu was
>>>>>>>>> set.
>>>>>>>>>
>>>>>>>>> The need for looping is also eliminated, since swapout and
>>>>>>>>> evict_mem_first will always follow the destruction path, so no
>>>>>>>>> new fence is allowed to be attached. As far as I can see this
>>>>>>>>> may already have been the case, but the unlocking / relocking
>>>>>>>>> required a complicated loop to deal with re-reservation.
>>>>>>>>>
>>>>>>>>> The downside is that ttm_bo_cleanup_memtype_use is no longer
>>>>>>>>> called with the reservation held, so drivers must be aware that
>>>>>>>>> move_notify with a null parameter doesn't require a reservation.
>>>>>>>> Why can't we unreserve *after* ttm_bo_cleanup_memtype_use? That's
>>>>>>>> not immediately clear from this patch.
>>>>>>> Because we would hold the reservation while waiting, with the
>>>>>>> object still on the swap and lru lists. That would defeat the
>>>>>>> whole purpose of keeping the object on multiple lists, plus break
>>>>>>> current code that assumes bo's on those lists can always be
>>>>>>> reserved.
>>>>>>>
>>>>>>> The if (ret && !no_wait_gpu) path has to drop the reservation and
>>>>>>> lru lock, and isn't guaranteed to be able to retake them. Maybe it
>>>>>>> could be guaranteed now, but I'm sure that would not be the case
>>>>>>> if the reservations were shared across devices.
>>>>>> The evict path removes the BO from the LRU lists, drops the LRU
>>>>>> lock but hangs on to the reservation, and in case the wait goes
>>>>>> wrong, re-adds the bo to the LRU lists and returns an error.
>>>>> If you really want to, we could hang on to the reservation in the
>>>>> !no_wait_gpu path; the wait shouldn't ever fail there, so I suppose
>>>>> leaving the bo off the lru lists and not re-adding it to any list on
>>>>> a failed wait is fine. It's still on the ddestroy list in that case,
>>>>> so not adding it back to the other lists is harmless.
>>>> Well, I'm a bit afraid that, theoretically, other callers may have a
>>>> bo reserved while cleanup_refs_and_unlock more or less runs the whole
>>>> destroy path on that buffer. Sure, we have control over those other
>>>> reservers, but it may come back and bite us.
>>> That's why initially I moved all the destruction to
>>> ttm_bo_release_list, to have all destruction in only one place. But
>>> even now it's serialized with the lru lock; while the destruction may
>>> not happen right away, it still happens before the last list ref to
>>> the bo is dropped.
>>>
>>> But it's your call, just choose the approach you want and I'll
>>> resubmit this. :-)
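(Aside: a minimal sketch of the evict-style wait pattern referred to
above, where the bo is taken off the LRU lists under the lru lock, the
reservation is held across the wait, and the bo is re-added on failure.
Illustration only, not code from the patch; it assumes the 3.7-era TTM
helpers named in this thread plus ttm_bo_add_to_lru, and the wrapper
name wait_evict_style is made up.)

  #include <drm/ttm/ttm_bo_api.h>
  #include <drm/ttm/ttm_bo_driver.h>

  /* Sketch: evict-style wait; the caller holds the reservation. */
  static int wait_evict_style(struct ttm_buffer_object *bo,
                              bool interruptible, bool no_wait_gpu)
  {
          struct ttm_bo_device *bdev = bo->bdev;
          struct ttm_bo_global *glob = bo->glob;
          int put_count, ret;

          /* Take the bo off the lru/swap lists while reserved. */
          spin_lock(&glob->lru_lock);
          put_count = ttm_bo_del_from_lru(bo);
          spin_unlock(&glob->lru_lock);
          ttm_bo_list_ref_sub(bo, put_count, true);

          spin_lock(&bdev->fence_lock);
          ret = ttm_bo_wait(bo, false, interruptible, no_wait_gpu);
          spin_unlock(&bdev->fence_lock);

          if (unlikely(ret != 0)) {
                  /* Wait failed or was interrupted: make the bo
                   * evictable again before reporting the error. */
                  spin_lock(&glob->lru_lock);
                  ttm_bo_add_to_lru(bo);
                  spin_unlock(&glob->lru_lock);
          }
          return ret;
  }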
>>>
>>>> Also the wait might fail if a signal is hit, so it's definitely
>>>> possible, and even likely in the case of the X server process.
>>>>
>>>> Anyway, I prefer if we could try to keep the reservation across the
>>>> ttm_cleanup_memtype_use function, and as far as I can tell, the only
>>>> thing preventing that is the reservation release in the
>>>> (!no_wait_gpu) path. So if we alter that to do the same as the evict
>>>> path, I think, without looking too deeply into the consequences, that
>>>> we should be safe.
>>> I think returning success early, without removing the bo from the
>>> ddestroy list, would be better if re-reserving with the lru lock held
>>> fails.
>>>
>>> We completed the wait and attempted to reserve the bo, which failed.
>>> Without the lru lock atomicity, this can't happen, since the only
>>> places that would do it call this with the lru lock held.
>>>
>>> With the atomicity removed, the only place that could do this is
>>> ttm_bo_delayed_delete with remove_all set to true. And even if that
>>> happened, the destruction code would run *anyway*, since we already
>>> completed the waiting part; it would just not necessarily run from
>>> this thread, but that guarantee didn't exist anyway.
>>>> Then we should be able to skip patch 2 as well.
>>> If my tryreserve approach sounds sane, the second patch should still
>>> be skippable. :-)
>> Sure, let's go for that approach.
> Ok, updated patch below. You only need to check whether you like the
> approach or not, since I haven't tested it beyond compiling. The rest
> of the series (minus patch 2) should still apply without modification.
>
> drm/ttm: call ttm_bo_cleanup_refs with reservation and lru lock held, v2
>
> By removing the unlocking of lru and retaking it immediately, a race is
> removed where the bo is taken off the swap list or the lru list between
> the unlock and relock. As such the cleanup_refs code can be simplified:
> it will attempt to call ttm_bo_wait non-blockingly, and if that fails
> it will drop the locks and perform a blocking wait, or return an error
> if no_wait_gpu was set.
>
> The need for looping is also eliminated, since swapout and
> evict_mem_first will always follow the destruction path, so no new
> fence is allowed to be attached. As far as I can see this may already
> have been the case, but the unlocking / relocking required a
> complicated loop to deal with re-reservation.
>
> Changes since v1:
>  - Simplify no_wait_gpu case by folding it in with empty ddestroy.
>  - Hold a reservation while calling ttm_bo_cleanup_memtype_use again.
>
> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index 202fc20..e9f01fe 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -488,12 +488,16 @@ static void ttm_bo_cleanup_memtype_use(struct ttm_buffer_object *bo)
>          ttm_bo_mem_put(bo, &bo->mem);
>  
>          atomic_set(&bo->reserved, 0);
> +        wake_up_all(&bo->event_queue);
>  
>          /*
> -         * Make processes trying to reserve really pick it up.
> +         * Since the final reference to this bo may not be dropped by
> +         * the current task we have to put a memory barrier here to make
> +         * sure the changes done in this function are always visible.
> +         *
> +         * This function only needs protection against the final kref_put.
>          */
> -        smp_mb__after_atomic_dec();
> -        wake_up_all(&bo->event_queue);
> +        smp_mb__before_atomic_dec();
>  }
>  
>  static void ttm_bo_cleanup_refs_or_queue(struct ttm_buffer_object *bo)
> @@ -543,68 +547,95 @@ static void ttm_bo_cleanup_refs_or_queue(struct ttm_buffer_object *bo)
>  }
>  
>  /**
> - * function ttm_bo_cleanup_refs
> + * function ttm_bo_cleanup_refs_and_unlock
>   * If bo idle, remove from delayed- and lru lists, and unref.
>   * If not idle, do nothing.
>   *
> + * Must be called with lru_lock and reservation held, this function
> + * will drop both before returning.
> + *
>   * @interruptible         Any sleeps should occur interruptibly.
> - * @no_wait_reserve       Never wait for reserve. Return -EBUSY instead.
>   * @no_wait_gpu           Never wait for gpu. Return -EBUSY instead.
>   */
>  
> -static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
> -                               bool interruptible,
> -                               bool no_wait_reserve,
> -                               bool no_wait_gpu)
> +static int ttm_bo_cleanup_refs_and_unlock(struct ttm_buffer_object *bo,
> +                                          bool interruptible,
> +                                          bool no_wait_gpu)
>  {
>          struct ttm_bo_device *bdev = bo->bdev;
> +        struct ttm_bo_driver *driver = bdev->driver;
>          struct ttm_bo_global *glob = bo->glob;
>          int put_count;
> -        int ret = 0;
> +        int ret;
>  
> -retry:
>          spin_lock(&bdev->fence_lock);
> -        ret = ttm_bo_wait(bo, false, interruptible, no_wait_gpu);
> -        spin_unlock(&bdev->fence_lock);
> +        ret = ttm_bo_wait(bo, false, false, true);
>  
> -        if (unlikely(ret != 0))
> -                return ret;
> +        if (ret && !no_wait_gpu) {
> +                void *sync_obj;
>  
> -retry_reserve:
> -        spin_lock(&glob->lru_lock);
> +                /*
> +                 * Take a reference to the fence and unreserve,
> +                 * at this point the buffer should be dead, so
> +                 * no new sync objects can be attached.
> +                 */
> +                sync_obj = driver->sync_obj_ref(&bo->sync_obj);
> +                spin_unlock(&bdev->fence_lock);
>  
> -        if (unlikely(list_empty(&bo->ddestroy))) {
> +                put_count = ttm_bo_del_from_lru(bo);
> +
> +                atomic_set(&bo->reserved, 0);
> +                wake_up_all(&bo->event_queue);
>                  spin_unlock(&glob->lru_lock);
> -                return 0;
> -        }
>  
> -        ret = ttm_bo_reserve_locked(bo, false, true, false, 0);
> +                ttm_bo_list_ref_sub(bo, put_count, true);
>  
> -        if (unlikely(ret == -EBUSY)) {
> -                spin_unlock(&glob->lru_lock);
> -                if (likely(!no_wait_reserve))
> -                        ret = ttm_bo_wait_unreserved(bo, interruptible);
> -                if (unlikely(ret != 0))
> +                ret = driver->sync_obj_wait(sync_obj, false, interruptible);
> +                driver->sync_obj_unref(&sync_obj);
> +                if (ret) {
> +                        /*
> +                         * Either the wait returned -ERESTARTSYS, or -EDEADLK
> +                         * (radeon lockup) here. No effort is made to re-add
> +                         * this bo to any lru list. Instead the bo only
> +                         * appears on the delayed destroy list.
> +                         */
>                          return ret;
> +                }

Actually, we *must* re-add the bo to LRU lists here, because otherwise
when a driver needs to evict a memory type completely, there's a large
chance it will miss this bo.
So I think either we need to keep the reservation, or keep the bo on the
LRU lists.

/Thomas
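(For illustration, an untested sketch of the first option against the v2
error path quoted above: keep the reservation across the blocking wait,
as the evict path does, and drop it only after putting the bo back on
the LRU lists. It assumes ttm_bo_add_to_lru may be called here with the
reservation held; this is a sketch of the suggestion, not a tested fix.)

                sync_obj = driver->sync_obj_ref(&bo->sync_obj);
                spin_unlock(&bdev->fence_lock);

                put_count = ttm_bo_del_from_lru(bo);
                /* Keep the reservation; only the lru lock is dropped. */
                spin_unlock(&glob->lru_lock);
                ttm_bo_list_ref_sub(bo, put_count, true);

                ret = driver->sync_obj_wait(sync_obj, false, interruptible);
                driver->sync_obj_unref(&sync_obj);
                if (ret) {
                        /*
                         * -ERESTARTSYS or -EDEADLK: re-add the bo to the
                         * LRU lists so that evicting a whole memory type
                         * can still find it, then drop the reservation.
                         * The bo stays on the delayed destroy list either
                         * way.
                         */
                        spin_lock(&glob->lru_lock);
                        ttm_bo_add_to_lru(bo);
                        atomic_set(&bo->reserved, 0);
                        wake_up_all(&bo->event_queue);
                        spin_unlock(&glob->lru_lock);
                        return ret;
                }

                /* On success, retake the lru lock and continue the
                 * teardown with the reservation still held. */
                spin_lock(&glob->lru_lock);

The second option, leaving the bo on the LRU lists while waiting, would
instead delay the ttm_bo_del_from_lru call until after the wait has
succeeded.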