On 21-11-12 14:27, Thomas Hellstrom wrote:
> On 11/21/2012 02:12 PM, Maarten Lankhorst wrote:
>> On 21-11-12 13:42, Thomas Hellstrom wrote:
>>> On 11/21/2012 12:38 PM, Maarten Lankhorst wrote:
>>>> Hey,
>>>>
>>>> On 20-11-12 16:08, Thomas Hellstrom wrote:
>>>>> On 11/20/2012 02:13 PM, Maarten Lankhorst wrote:
>>>>>> On 20-11-12 13:03, Thomas Hellstrom wrote:
>>>>>>> On 11/20/2012 12:33 PM, Maarten Lankhorst wrote:
>>>>>>>> On 20-11-12 08:48, Thomas Hellstrom wrote:
>>>>>>>>> On 11/19/2012 04:33 PM, Maarten Lankhorst wrote:
>>>>>>>>>> On 19-11-12 16:04, Thomas Hellstrom wrote:
>>>>>>>>>>> On 11/19/2012 03:17 PM, Thomas Hellstrom wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> This patch looks mostly good, although I think ttm_bo_cleanup_refs 
>>>>>>>>>>>> becomes overly complicated:
>>>>>>>>>>>> Could this do, or am I missing something?
>>>>>>>>>>>>
>>>>>>>>>>> Actually, my version is bad, because ttm_bo_wait() is called with 
>>>>>>>>>>> the lru lock held.
>>>>>>>>>>>
>>>>>>>>>>> /Thomas
>>>>>>>>>> Oh, digging through it made me remember why I had to release the
>>>>>>>>>> reservation early and had to allow move_notify to be called without
>>>>>>>>>> a reservation.
>>>>>>>>>>
>>>>>>>>>> Fortunately that only happens when move_notify is called with a NULL
>>>>>>>>>> parameter, so you can still check with
>>>>>>>>>> BUG_ON(mem != NULL && !ttm_bo_reserved(bo)); in your move_notify
>>>>>>>>>> handler.
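>>>>>>>>>>
>>>>>>>>>> As a rough sketch of what I mean (driver_move_notify is a stand-in
>>>>>>>>>> for the driver's own handler, and I'm assuming a ttm_bo_reserved()
>>>>>>>>>> helper with that name):
>>>>>>>>>>
>>>>>>>>>> static void driver_move_notify(struct ttm_buffer_object *bo,
>>>>>>>>>>                                struct ttm_mem_reg *mem)
>>>>>>>>>> {
>>>>>>>>>>         /* mem == NULL is the destruction path, and the only case
>>>>>>>>>>          * in which we may be called without the reservation held */
>>>>>>>>>>         BUG_ON(mem != NULL && !ttm_bo_reserved(bo));
>>>>>>>>>>
>>>>>>>>>>         if (mem == NULL) {
>>>>>>>>>>                 /* final teardown: unmap vm space, etc. */
>>>>>>>>>>                 return;
>>>>>>>>>>         }
>>>>>>>>>>
>>>>>>>>>>         /* normal move handling; the reservation is held here */
>>>>>>>>>> }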
>>>>>>>>>>
>>>>>>>>>> 05/10 removed the loop and assumed no new fence could be attached 
>>>>>>>>>> after the driver has
>>>>>>>>>> declared the bo dead.
>>>>>>>>>>
>>>>>>>>>> However, at that point it may no longer hold a reservation to
>>>>>>>>>> confirm this, which is why I moved the cleanup into the release_list
>>>>>>>>>> handler. It could still be done in ttm_bo_release, but we no longer
>>>>>>>>>> have a reservation after we waited, and getting a reservation can
>>>>>>>>>> fail, for example if the bo is imported.
>>>>>>>>>>
>>>>>>>>>> While it's true that in that case a new fence may be attached as
>>>>>>>>>> well, that would be less harmful, since that operation wouldn't
>>>>>>>>>> involve this device, so the ttm bo can still be removed. When that
>>>>>>>>>> time comes I should probably fix up that WARN_ON(ret) in
>>>>>>>>>> ttm_bo_cleanup_refs. :-)
>>>>>>>>>>
>>>>>>>>>> I did add a WARN_ON(!atomic_read(&bo->kref.refcount)); to
>>>>>>>>>> ttm_bo_reserve and ttm_eu_reserve_buffers to be sure nothing is done
>>>>>>>>>> on the device itself. If that is too paranoid, those WARN_ONs could
>>>>>>>>>> be dropped; I prefer to leave them in for a kernel release or two.
>>>>>>>>>> But according to the rules, that would be the only time you could
>>>>>>>>>> attach a new fence and trigger the WARN_ON for now.
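>>>>>>>>>>
>>>>>>>>>> I.e. roughly this (a sketch):
>>>>>>>>>>
>>>>>>>>>> /* at the top of ttm_bo_reserve() and ttm_eu_reserve_buffers(): */
>>>>>>>>>>
>>>>>>>>>> /* reserving a bo whose refcount has already dropped to zero means
>>>>>>>>>>  * someone is about to operate on a dead object */
>>>>>>>>>> WARN_ON(!atomic_read(&bo->kref.refcount));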
>>>>>>>>> Hmm, I'd appreciate it if you could group patches with functional
>>>>>>>>> changes that depend on each other together, and add a "this is done
>>>>>>>>> because ..." explanation, which makes it much easier to review (and
>>>>>>>>> to follow the commit history in case something goes terribly wrong
>>>>>>>>> and we need to revert).
>>>>>>>>>
>>>>>>>>> Meanwhile I'll take a look at the final ttm_bo.c and see if I can 
>>>>>>>>> spot any culprits.
>>>>>>>>>
>>>>>>>>> In general, as long as a bo is on a LRU list, we must be able to 
>>>>>>>>> attach fences because of accelerated eviction.
>>>>>>>> I thought it was deliberately designed so that it was kept on the lru
>>>>>>>> list, but since it's also on the ddestroy list it won't start
>>>>>>>> accelerated eviction: it branches into cleanup_refs early, and
>>>>>>>> lru_lock still protects all the list entries.
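>>>>>>>>
>>>>>>>> Roughly this shape in ttm_mem_evict_first (paraphrased from memory,
>>>>>>>> so the exact ttm_bo_cleanup_refs arguments may differ):
>>>>>>>>
>>>>>>>> spin_lock(&glob->lru_lock);
>>>>>>>> list_for_each_entry(bo, &man->lru, lru) {
>>>>>>>>         if (!list_empty(&bo->ddestroy)) {
>>>>>>>>                 /* already dead: finish the delayed destruction
>>>>>>>>                  * instead of starting a new eviction */
>>>>>>>>                 ret = ttm_bo_cleanup_refs(bo, interruptible,
>>>>>>>>                                           no_wait_reserve,
>>>>>>>>                                           no_wait_gpu);
>>>>>>>>                 break;
>>>>>>>>         }
>>>>>>>>         /* otherwise, a normal eviction candidate */
>>>>>>>> }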
>>>>>>> I used bad wording. I meant that unbinding might be accelerated, but
>>>>>>> we currently (quite inefficiently) do synchronized unbinding, assuming
>>>>>>> that only the CPU can do it. When we start to support unsynchronized
>>>>>>> moves, we need to be able to attach fences at least at the last
>>>>>>> move_notify(bo, NULL);
>>>>>> In that case, would you need to wait for fence_wait to complete before
>>>>>> calling move_notify?
>>>>>>
>>>>>> If not, you would still only need to perform one wait, but you'd have
>>>>>> to make sure move_notify only gets called by one thread before checking
>>>>>> the fence pointer and performing a wait. At that point you still hold
>>>>>> the lru_lock though, so it shouldn't be too hard to make something
>>>>>> safe.
>>>>> I think a driver that wants to implement asynchronous moves typically
>>>>> doesn't want to wait before calling move_notify, but may wait in
>>>>> move_notify or move. Typically (upcoming vmwgfx) it would invalidate the
>>>>> buffer in move_notify(bo, NULL), attach a fence, and then use the normal
>>>>> delayed destroy to wait on that fence before destroying the buffer.
>>>>>
>>>>> Otherwise, since binds / unbinds are handled in the GPU command stream 
>>>>> there's never any need to wait for moves except when there's a CPU
>>>>> access.
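>>>>>
>>>>> Schematically something like this; purely hypothetical driver code,
>>>>> with driver_invalidate_buffer() and driver_fence_buffer() as stand-ins
>>>>> for whatever the driver actually does:
>>>>>
>>>>> static void driver_move_notify(struct ttm_buffer_object *bo,
>>>>>                                struct ttm_mem_reg *mem)
>>>>> {
>>>>>         if (mem == NULL) {
>>>>>                 /* queue a GPU command to invalidate / unbind the bo */
>>>>>                 driver_invalidate_buffer(bo);
>>>>>                 /* fence the unbind; the normal delayed destroy path
>>>>>                  * then waits on this fence before freeing the bo */
>>>>>                 driver_fence_buffer(bo);
>>>>>         }
>>>>> }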
>>>> Well, nouveau actually needs fence_wait to finish first, since vm changes 
>>>> are out of band.
>>>> But I guess it should be possible to attach it as work to the fence when 
>>>> it's signaled, and I
>>>> may want to do something like that already for performance reasons in a 
>>>> different place,
>>>> so I guess it doesn't matter.
>>> Actions to be performed on fence signaling tend to be very CPU-consuming,
>>> I think due to the context switches involved. We had to replace that in
>>> the old psb driver and batch things like TTM does instead.
>>>
>>> Also remember that TTM fences are not required to signal in finite time 
>>> unless fence_flush is called.
>>>
>>> I think nouveau doesn't use fence irqs to signal its fences.
>>>
>>>> Is calling move_notify(bo, NULL) legal and a noop the second time?
>>> I see no fundamental reason why it shouldn't be OK, although we might need 
>>> to patch drivers to cope with it.
>>>
>>>>    That would save a flag in the bo to track whether it has already been
>>>> called, although I suppose we could always define a TTM_BO_PRIV_FLAG_*
>>>> for it otherwise.
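>>>>
>>>> Something like this, with TTM_BO_PRIV_FLAG_DEAD_NOTIFIED as a made-up
>>>> name for such a flag (assuming priv_flags has a free bit for it):
>>>>
>>>> #define TTM_BO_PRIV_FLAG_DEAD_NOTIFIED 1 /* hypothetical */
>>>>
>>>> /* in the destruction path, guard against a second call: */
>>>> if (bo->bdev->driver->move_notify &&
>>>>     !test_and_set_bit(TTM_BO_PRIV_FLAG_DEAD_NOTIFIED, &bo->priv_flags))
>>>>         bo->bdev->driver->move_notify(bo, NULL);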
>>>>
>>>> move_notify might end up being called with the lru_lock held, but that 
>>>> shouldn't be a problem.
>>> I don't think that's a good idea. Drivers sleeping in move_notify will need 
>>> to release the spinlock, and that means it's
>>> better to release it before move_notify is called.
>> Is the only sleeping being done on fences? In that case we might wish to
>> split destruction up into two pieces: the piece that runs immediately, and
>> a piece that runs after the new fence has signaled (the current behavior).
>>
>> Nouveau needs the final move_notify unmap to be called after the object is
>> idle, like it is now. It doesn't need to attach a new fence.
>
> In that case it might be best to worry about asynchronous stuff later?
> We will eventually implement it on the new vmwgfx hardware revision, but it's 
> not ready yet.
>
> /Thomas
OK, sounds good.

In that case, what do you want me to change in the first 4 patches, apart
from more verbose commit messages?
- 03/10: I got that I need to re-add the list_empty check after -EBUSY was
returned in evict_mem_first.

Also, patch 05/10 cleans up the spinning in ttm_bo_cleanup_refs, so I hope
it's OK that it's a bit ugly in 04/10, as long as it doesn't result in any
new bugs being introduced.

~Maarten

PS: I did a plain rebase of my git tree to deal with the conflicts in drm-next.
