On 11/28/2012 07:32 PM, Maarten Lankhorst wrote:
> Op 28-11-12 16:10, Thomas Hellstrom schreef:
>> On 11/28/2012 03:46 PM, Maarten Lankhorst wrote:
>>> Op 28-11-12 15:23, Thomas Hellstrom schreef:
>>>> On 11/28/2012 02:55 PM, Maarten Lankhorst wrote:
>>>>> Op 28-11-12 14:21, Thomas Hellstrom schreef:
>>>>>> On 11/28/2012 01:15 PM, Maarten Lankhorst wrote:
>>>>>>> Op 28-11-12 12:54, Thomas Hellstrom schreef:
>>>>>>>> On 11/28/2012 12:25 PM, Maarten Lankhorst wrote:
>>>>>>>>> By removing the unlocking of lru and retaking it immediately, a
>>>>>>>>> race is removed where the bo is taken off the swap list or the
>>>>>>>>> lru list between the unlock and relock. As such the cleanup_refs
>>>>>>>>> code can be simplified: it will attempt to call ttm_bo_wait
>>>>>>>>> non-blockingly, and if that fails it will drop the locks and
>>>>>>>>> perform a blocking wait, or return an error if no_wait_gpu was
>>>>>>>>> set.
>>>>>>>>>
>>>>>>>>> The need for looping is also eliminated, since swapout and
>>>>>>>>> evict_mem_first will always follow the destruction path, so no
>>>>>>>>> new fence is allowed to be attached. As far as I can see this
>>>>>>>>> may already have been the case, but the unlocking / relocking
>>>>>>>>> required a complicated loop to deal with re-reservation.
>>>>>>>>>
>>>>>>>>> The downside is that ttm_bo_cleanup_memtype_use is no longer
>>>>>>>>> called with the reservation held, so drivers must be aware that
>>>>>>>>> move_notify with a null parameter doesn't require a reservation.
>>>>>>>> Why can't we unreserve *after* ttm_bo_cleanup_memtype_use? That's
>>>>>>>> not immediately clear from this patch.
>>>>>>> Because we would hold the reservation while waiting, with the
>>>>>>> object still on the swap and lru lists. That would defeat the
>>>>>>> whole purpose of keeping the object on multiple lists, plus break
>>>>>>> current code that assumes bo's on those lists can always be
>>>>>>> reserved.
>>>>>>>
>>>>>>> The if (ret && !no_wait_gpu) path has to drop the reservation and
>>>>>>> lru lock, and isn't guaranteed to be able to retake them. Maybe it
>>>>>>> could be guaranteed now, but I'm sure that would not be the case
>>>>>>> if the reservations were shared across devices.
>>>>>> The evict path removes the BO from the LRU lists, drops the LRU
>>>>>> lock but hangs on to the reservation, and in case the wait goes
>>>>>> wrong, re-adds the bo to the LRU lists and returns an error.
>>>>> If you really want to, we could hang on to the reservation in the
>>>>> !no_wait_gpu path; the wait shouldn't ever fail there, so I suppose
>>>>> leaving the bo off the lru lists and not re-adding it to any list on
>>>>> a failed wait is fine. It's still on the ddestroy list in that case,
>>>>> so not adding it back to the other lists is harmless.
>>>> Well, I'm a bit afraid that, theoretically, other callers may have a
>>>> bo reserved while cleanup_refs_and_unlock more or less runs the whole
>>>> destroy path on that buffer. Sure, we have control over those other
>>>> reservers, but it may come back and bite us.
>>> That's why initially I moved all the destruction to
>>> ttm_bo_release_list, to have all destruction in only one place. But
>>> even now it's serialized with the lru lock; while the destruction may
>>> not happen right away, it still happens before the last list ref to
>>> the bo is dropped.
>>>
>>> But it's your call, just choose the approach you want and I'll
>>> resubmit this. :-)
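(Aside: a minimal sketch of the evict-style wait pattern referred to
above, where the bo is taken off the LRU lists under the lru lock, the
reservation is held across the wait, and the bo is re-added on failure.
Illustration only, not code from the patch; it assumes the 3.7-era TTM
helpers named in this thread plus ttm_bo_add_to_lru, and the wrapper
name wait_evict_style is made up.)

  #include <drm/ttm/ttm_bo_api.h>
  #include <drm/ttm/ttm_bo_driver.h>

  /* Sketch: evict-style wait; the caller holds the reservation. */
  static int wait_evict_style(struct ttm_buffer_object *bo,
                              bool interruptible, bool no_wait_gpu)
  {
          struct ttm_bo_device *bdev = bo->bdev;
          struct ttm_bo_global *glob = bo->glob;
          int put_count, ret;

          /* Take the bo off the lru/swap lists while reserved. */
          spin_lock(&glob->lru_lock);
          put_count = ttm_bo_del_from_lru(bo);
          spin_unlock(&glob->lru_lock);
          ttm_bo_list_ref_sub(bo, put_count, true);

          spin_lock(&bdev->fence_lock);
          ret = ttm_bo_wait(bo, false, interruptible, no_wait_gpu);
          spin_unlock(&bdev->fence_lock);

          if (unlikely(ret != 0)) {
                  /* Wait failed or was interrupted: make the bo
                   * evictable again before reporting the error. */
                  spin_lock(&glob->lru_lock);
                  ttm_bo_add_to_lru(bo);
                  spin_unlock(&glob->lru_lock);
          }
          return ret;
  }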
>>>
>>>> Also the wait might fail if a signal is hit, so it's definitely
>>>> possible, and even likely in the case of the X server process.
>>>>
>>>> Anyway, I prefer if we could try to keep the reservation across the
>>>> ttm_cleanup_memtype_use function, and as far as I can tell, the only
>>>> thing preventing that is the reservation release in the
>>>> (!no_wait_gpu) path. So if we alter that to do the same as the evict
>>>> path, I think, without looking too deeply into the consequences, that
>>>> we should be safe.
>>> I think returning success early, without removing the bo from the
>>> ddestroy list, would be better if re-reserving with the lru lock held
>>> fails.
>>>
>>> We completed the wait and attempted to reserve the bo, which failed.
>>> Without the lru lock atomicity, this can't happen, since the only
>>> places that would do it call this with the lru lock held.
>>>
>>> With the atomicity removed, the only place that could do this is
>>> ttm_bo_delayed_delete with remove_all set to true. And even if that
>>> happened, the destruction code would run *anyway*, since we already
>>> completed the waiting part; it would just not necessarily run from
>>> this thread, but that guarantee didn't exist anyway.
>>>> Then we should be able to skip patch 2 as well.
>>> If my tryreserve approach sounds sane, the second patch should still
>>> be skippable. :-)
>> Sure, let's go for that approach.
> Ok, updated patch below. You only need to check whether you like the
> approach or not, since I haven't tested it beyond compiling. The rest
> of the series (minus patch 2) should still apply without modification.
>
> drm/ttm: call ttm_bo_cleanup_refs with reservation and lru lock held, v2
>
> By removing the unlocking of lru and retaking it immediately, a race is
> removed where the bo is taken off the swap list or the lru list between
> the unlock and relock. As such the cleanup_refs code can be simplified:
> it will attempt to call ttm_bo_wait non-blockingly, and if that fails
> it will drop the locks and perform a blocking wait, or return an error
> if no_wait_gpu was set.
>
> The need for looping is also eliminated, since swapout and
> evict_mem_first will always follow the destruction path, so no new
> fence is allowed to be attached. As far as I can see this may already
> have been the case, but the unlocking / relocking required a
> complicated loop to deal with re-reservation.
>
> Changes since v1:
>  - Simplify no_wait_gpu case by folding it in with empty ddestroy.
>  - Hold a reservation while calling ttm_bo_cleanup_memtype_use again.
>
> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index 202fc20..e9f01fe 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -488,12 +488,16 @@ static void ttm_bo_cleanup_memtype_use(struct ttm_buffer_object *bo)
>          ttm_bo_mem_put(bo, &bo->mem);
>  
>          atomic_set(&bo->reserved, 0);
> +        wake_up_all(&bo->event_queue);
>  
>          /*
> -         * Make processes trying to reserve really pick it up.
> +         * Since the final reference to this bo may not be dropped by
> +         * the current task we have to put a memory barrier here to make
> +         * sure the changes done in this function are always visible.
> +         *
> +         * This function only needs protection against the final kref_put.
>          */
> -        smp_mb__after_atomic_dec();
> -        wake_up_all(&bo->event_queue);
> +        smp_mb__before_atomic_dec();
>  }
>  
>  static void ttm_bo_cleanup_refs_or_queue(struct ttm_buffer_object *bo)
> @@ -543,68 +547,95 @@ static void ttm_bo_cleanup_refs_or_queue(struct ttm_buffer_object *bo)
>  }
>  
>  /**
> - * function ttm_bo_cleanup_refs
> + * function ttm_bo_cleanup_refs_and_unlock
>   * If bo idle, remove from delayed- and lru lists, and unref.
>   * If not idle, do nothing.
>   *
> + * Must be called with lru_lock and reservation held, this function
> + * will drop both before returning.
> + *
>   * @interruptible         Any sleeps should occur interruptibly.
> - * @no_wait_reserve       Never wait for reserve. Return -EBUSY instead.
>   * @no_wait_gpu           Never wait for gpu. Return -EBUSY instead.
>   */
>  
> -static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
> -                               bool interruptible,
> -                               bool no_wait_reserve,
> -                               bool no_wait_gpu)
> +static int ttm_bo_cleanup_refs_and_unlock(struct ttm_buffer_object *bo,
> +                                          bool interruptible,
> +                                          bool no_wait_gpu)
>  {
>          struct ttm_bo_device *bdev = bo->bdev;
> +        struct ttm_bo_driver *driver = bdev->driver;
>          struct ttm_bo_global *glob = bo->glob;
>          int put_count;
> -        int ret = 0;
> +        int ret;
>  
> -retry:
>          spin_lock(&bdev->fence_lock);
> -        ret = ttm_bo_wait(bo, false, interruptible, no_wait_gpu);
> -        spin_unlock(&bdev->fence_lock);
> +        ret = ttm_bo_wait(bo, false, false, true);
>  
> -        if (unlikely(ret != 0))
> -                return ret;
> +        if (ret && !no_wait_gpu) {
> +                void *sync_obj;
>  
> -retry_reserve:
> -        spin_lock(&glob->lru_lock);
> +                /*
> +                 * Take a reference to the fence and unreserve,
> +                 * at this point the buffer should be dead, so
> +                 * no new sync objects can be attached.
> +                 */
> +                sync_obj = driver->sync_obj_ref(&bo->sync_obj);
> +                spin_unlock(&bdev->fence_lock);
>  
> -        if (unlikely(list_empty(&bo->ddestroy))) {
> +                put_count = ttm_bo_del_from_lru(bo);
> +
> +                atomic_set(&bo->reserved, 0);
> +                wake_up_all(&bo->event_queue);
>                  spin_unlock(&glob->lru_lock);
> -                return 0;
> -        }
>  
> -        ret = ttm_bo_reserve_locked(bo, false, true, false, 0);
> +                ttm_bo_list_ref_sub(bo, put_count, true);
>  
> -        if (unlikely(ret == -EBUSY)) {
> -                spin_unlock(&glob->lru_lock);
> -                if (likely(!no_wait_reserve))
> -                        ret = ttm_bo_wait_unreserved(bo, interruptible);
> -                if (unlikely(ret != 0))
> +                ret = driver->sync_obj_wait(sync_obj, false, interruptible);
> +                driver->sync_obj_unref(&sync_obj);
> +                if (ret) {
> +                        /*
> +                         * Either the wait returned -ERESTARTSYS, or -EDEADLK
> +                         * (radeon lockup) here. No effort is made to re-add
> +                         * this bo to any lru list. Instead the bo only
> +                         * appears on the delayed destroy list.
> +                         */
>                          return ret;
> +                }

Actually, we *must* re-add the bo to LRU lists here, because otherwise
when a driver needs to evict a memory type completely, there's a large
chance it will miss this bo.
So I think either we need to keep the reservation, or keep the bo on the
LRU lists.

/Thomas
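(For illustration, an untested sketch of the first option against the v2
error path quoted above: keep the reservation across the blocking wait,
as the evict path does, and drop it only after putting the bo back on
the LRU lists. It assumes ttm_bo_add_to_lru may be called here with the
reservation held; this is a sketch of the suggestion, not a tested fix.)

                sync_obj = driver->sync_obj_ref(&bo->sync_obj);
                spin_unlock(&bdev->fence_lock);

                put_count = ttm_bo_del_from_lru(bo);
                /* Keep the reservation; only the lru lock is dropped. */
                spin_unlock(&glob->lru_lock);
                ttm_bo_list_ref_sub(bo, put_count, true);

                ret = driver->sync_obj_wait(sync_obj, false, interruptible);
                driver->sync_obj_unref(&sync_obj);
                if (ret) {
                        /*
                         * -ERESTARTSYS or -EDEADLK: re-add the bo to the
                         * LRU lists so that evicting a whole memory type
                         * can still find it, then drop the reservation.
                         * The bo stays on the delayed destroy list either
                         * way.
                         */
                        spin_lock(&glob->lru_lock);
                        ttm_bo_add_to_lru(bo);
                        atomic_set(&bo->reserved, 0);
                        wake_up_all(&bo->event_queue);
                        spin_unlock(&glob->lru_lock);
                        return ret;
                }

                /* On success, retake the lru lock and continue the
                 * teardown with the reservation still held. */
                spin_lock(&glob->lru_lock);

The second option, leaving the bo on the LRU lists while waiting, would
instead delay the ttm_bo_del_from_lru call until after the wait has
succeeded.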