On 10/12/2012 12:09 PM, Maarten Lankhorst wrote:
On 12-10-12 07:57, Thomas Hellstrom wrote:
On 10/11/2012 10:55 PM, Maarten Lankhorst wrote:
On 11-10-12 21:26, Thomas Hellstrom wrote:
On 10/11/2012 08:42 PM, Maarten Lankhorst wrote:

Anyway, if you plan to remove the fence lock and protect it with reserve, you
must make sure that a waiting reserve is never done in a destruction path. I
think this mostly concerns the nvidia driver.
Well I don't think any lock should ever be held during destruction time,
What I mean is that *something* needs to protect the fence pointer. Currently
it's the fence lock, and I was assuming you'd protect it with reserve. And
neither TTM nor Nvidia should, when a resource is about to be freed, be forced
to *block* waiting for reserve just to access the fence pointer. When and if
you have a solution that fulfills those requirements, I'm ready to review it.
It's not blocking; cleanup_refs_or_queue will toss it on the deferred list if
reservation fails. The behavior doesn't change just because I changed the
order around.
Well, I haven't looked into the code in detail yet. If you say it's
non-blocking, I believe you. I was actually more concerned about the Nvidia
case where, IIRC, the wait was called both with and without reservation.


- no_wait_reserve is ignored if no_wait_gpu is false:
      ttm_bo_reserve_locked can only return true if no_wait_reserve is true, but
      subsequently it will do a wait_unreserved if no_wait_gpu is false.
I'm planning on removing this argument and acting as if it is always true,
since nothing on the LRU list should currently fail to reserve.
Yes, since all buffers that are reserved are removed from the LRU list, there
should never be a waiting reserve on them, so no_wait_reserve can be removed
from ttm_mem_evict_first, ttm_bo_evict and possibly other functions in the
call chain.
I suppose a small race will remain, though.
Hmm, where?
When you enter the ddestroy path, you drop the lock and hope the buffer
doesn't get reserved away from under you.
Yes, that code isn't fully correct; it's missing a check for still being on
ddestroy after a waiting reserve. However, the only chance of a waiting
reserve, given that the buffer *IS* on the ddestroy list, is if the current
reserver returned early because someone started an accelerated eviction,
which can't happen currently. The code needs fixing up though.
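
The missing check would amount to a recheck-after-reserve, roughly as in this
hypothetical pthread sketch (not the actual ttm_bo_cleanup_refs):

#include <pthread.h>
#include <stdbool.h>
#include <sys/queue.h>

struct buffer {
        pthread_mutex_t reserve;           /* stands in for the bo reservation */
        bool on_ddestroy;                  /* still queued for delayed destroy? */
        LIST_ENTRY(buffer) ddestroy_entry;
};

/* Returns true if the caller may go on and destroy the buffer. */
static bool cleanup_refs(struct buffer *bo, pthread_mutex_t *list_lock)
{
        pthread_mutex_lock(&bo->reserve);  /* may be a waiting reserve */

        /* Re-validate under the list lock: while we were waiting, someone
         * else may have reserved the buffer and pulled it off the ddestroy
         * list already. */
        pthread_mutex_lock(list_lock);
        if (!bo->on_ddestroy) {
                pthread_mutex_unlock(list_lock);
                pthread_mutex_unlock(&bo->reserve);
                return false;
        }
        LIST_REMOVE(bo, ddestroy_entry);
        bo->on_ddestroy = false;
        pthread_mutex_unlock(list_lock);

        pthread_mutex_unlock(&bo->reserve);
        return true;
}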

- effectively unlimited callchain between some functions that all go through
      ttm_mem_evict_first:

                                     /------------------------\
ttm_mem_evict_first - ttm_bo_evict -                          - ttm_bo_mem_space - ttm_bo_mem_force_space - ttm_mem_evict_first
                                     \ ttm_bo_handle_move_mem /
I'm not surprised that there was a deadlock before; it seems to me it would be
pretty suicidal to ever do a blocking reserve on any of those lists, and
lockdep would be all over you for this.
Well, at first this may look worse than it actually is. The driver's eviction
memory order determines the recursion depth, and typically it's 0 or 1, since
a subsequent ttm_mem_evict_first should never touch the same LRU lists as the
first one. What would typically happen is that a BO is evicted from VRAM to
TT, and if there is no space in TT, another BO is evicted to system memory,
and the chain is terminated. However, a driver could set up any eviction
order, but that would be a BUG.

But in essence, as you say, even with a small recursion depth, a waiting
reserve could cause a deadlock. But there should be no waiting reserves in the
eviction path currently.
Partially true; ttm_bo_cleanup_refs is currently capable of a blocking
reserve. Fixing this might mean that ttm_mem_evict_first may need to become
more aggressive, since it seems all the callers of this function assume that
ttm_mem_evict_first can only fail if there is really nothing more to free,
and blocking nested would really upset lockdep and leave you open to the same
deadlocks.
I can't see how the waiting reserve in ttm_bo_cleanup_refs would cause a
deadlock, because the buffer about to be reserved is always *last* in a
reservation sequence, and the reservation is always released (or the buffer
destroyed) before trying to reserve another buffer. Technically the buffer is
not looked up from an LRU list but from the delayed delete list. Could you
describe such a deadlock case?
The only interesting case for this is ttm_mem_evict_first, and while it may
not technically be a deadlock, lockdep will flag you for blocking on this
anyway, since the only reason it would not be a deadlock is knowledge of the
exact semantics, which lockdep doesn't have.
Interesting. I guess that must be because of the previous reservation history
for that buffer? Let's say we were to reinitialize the lockdep history for the
reservation object when it was put on the ddestroy list; I assume lockdep
would keep quiet, because there are never any other bo reservations while
such a buffer is reserved?
Lockdep works on classes of locks, not necessarily individual locks.

Doing two bo_reserves on any bos would count as a possible deadlock, no
matter whether you always take them in a certain order or not.

So you mean that if I bo_reserve A and then bo_reserve B (which is used only when binding A to the GPU), lockdep will complain even if nobody ever bo_reserves B before A? That will make it impossible to use BOs as page tables for GPU binding for example.
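
(Roughly what lockdep sees in that case, as a hypothetical kernel-style sketch
rather than the real bo_reserve code: every mutex initialised at the same
mutex_init() call site shares one lock class, so nesting two of them looks
like a possible recursive deadlock unless the nesting is annotated.)

#include <linux/mutex.h>

struct my_bo {
        struct mutex reserve;              /* stands in for the bo reservation */
};

static void my_bo_init(struct my_bo *bo)
{
        /* Every bo initialised here gets the same lock class: lockdep
         * cannot tell bo A's reserve from bo B's. */
        mutex_init(&bo->reserve);
}

static void bind_to_gpu(struct my_bo *a, struct my_bo *b)
{
        mutex_lock(&a->reserve);
        /* Even if b (e.g. a page-table bo) is only ever locked after a,
         * this nests two locks of the same class; lockdep reports it as a
         * possible recursive deadlock unless the nesting is annotated
         * (mutex_lock_nested(), or an outer "ticket" lock). */
        mutex_lock(&b->reserve);

        mutex_unlock(&b->reserve);
        mutex_unlock(&a->reserve);
}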


To make multi-object reservation work, the fix is to add a ticket "lock" under
which all the reservation objects are nested. Since in this case the ticket
lock would prevent deadlocks, this is acceptable. Having two ticket 'locks' at
the same time would rightfully count as a deadlock. If you hold a reservation
from a ticket and then try to reserve without a ticket, it counts as a
deadlock too. See below for some examples I was using to test.

But if a ticket lock can be used to abstract a locking sequence that we know will never deadlock, why can't we abstract locking from a list with atomic list removal with another "lock", and make sure we get the order right between them?
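
A sketch of how such an outer "lock" can make a whole sequence acceptable to
lockdep, using plain kernel mutexes and hypothetical names (the real ticket
implementation differs):

#include <linux/mutex.h>

static DEFINE_MUTEX(ticket);               /* stands in for the reservation ticket */

static void reserve_list(struct mutex **bo_reserve, int n)
{
        int i;

        mutex_lock(&ticket);
        for (i = 0; i < n; i++)
                /* Each per-bo lock is annotated as nested under the ticket,
                 * so lockdep accepts many locks of one class being held at
                 * once; the ticket is what guarantees a global ordering. */
                mutex_lock_nest_lock(bo_reserve[i], &ticket);

        /* ... validate buffers, submit work ... */

        for (i = n - 1; i >= 0; i--)
                mutex_unlock(bo_reserve[i]);
        mutex_unlock(&ticket);
}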


Since it's counted as a normal lock, normal lockdep rules regarding locking
apply: if you hold a lock and then take a reservation in one place, and do it
the other way around in another, it is counted as a potential deadlock.

Also, you can't simply reset the history for a single object, because lockdep
acts on classes of locks, not individual locks. Resetting the state would mean
lockdep gets thoroughly confused, since it no longer knows about reservations
currently held by any task or any cpu, so please don't.

~Maarten

Below are some tests I was doing to show the different kinds of things lockdep
will catch. I used them to do some selftests on reservation objects' lockdep.

I'll take a look and see if I can digest this :)

/Thomas



The tests cover spinlock, ticket, try and block combinations; the spinlock
tests invert the lock order between the spinlock and the other possibilities.
Because the reservation is a lock type, those tests will fail. FAILURE means
that lockdep caught an error; SUCCESS means lockdep thinks what you do is ok:

syntax for object_reserve:
int object_reserve(object, interruptible, no_wait, ticket);

static void reservation_test_fail_reserve(void)
{
        struct reservation_ticket t;
        struct reservation_object o;

        reservation_object_init(&o);
        reservation_ticket_init(&t);
        t.seqno++;

        object_reserve(&o, false, false, &t);
        /* No lockdep test, pure API */
        WARN_ON(object_reserve(&o, false, true, &t) != -EDEADLK);
        t.seqno--;
        WARN_ON(object_reserve(&o, false, true, &t) != -EBUSY);
        t.seqno += 2;
        WARN_ON(object_reserve(&o, false, true, &t) != -EAGAIN);
        object_unreserve(&o, NULL);

        reservation_ticket_fini(&t);
}

static void reservation_test_two_tickets(void)
{
        struct reservation_ticket t, t2;

        reservation_ticket_init(&t);
        reservation_ticket_init(&t2);

        reservation_ticket_fini(&t2);
        reservation_ticket_fini(&t);
}

static void reservation_test_ticket_unreserve_twice(void)
{
        struct reservation_ticket t;

        reservation_ticket_init(&t);
        reservation_ticket_fini(&t);
        reservation_ticket_fini(&t);
}

static void reservation_test_object_unreserve_twice(void)
{
        struct reservation_object o;

        reservation_object_init(&o);
        object_reserve(&o, false, false, NULL);
        object_unreserve(&o, NULL);
        object_unreserve(&o, NULL);
}

static void reservation_test_fence_nest_unreserved(void)
{
        struct reservation_object o;

        reservation_object_init(&o);

        spin_lock_nest_lock(&lock_A, &o);
        spin_unlock(&lock_A);
}

static void reservation_test_ticket_block(void)
{
        struct reservation_ticket t;
        struct reservation_object o, o2;

        reservation_object_init(&o);
        reservation_object_init(&o2);
        reservation_ticket_init(&t);

        object_reserve(&o, false, false, &t);
        object_reserve(&o2, false, false, NULL);
        object_unreserve(&o2, NULL);
        object_unreserve(&o, &t);

        reservation_ticket_fini(&t);
}

static void reservation_test_ticket_try(void)
{
        struct reservation_ticket t;
        struct reservation_object o, o2;

        reservation_object_init(&o);
        reservation_object_init(&o2);
        reservation_ticket_init(&t);

        object_reserve(&o, false, false, &t);
        object_reserve(&o2, false, true, NULL);
        object_unreserve(&o2, NULL);
        object_unreserve(&o, &t);

        reservation_ticket_fini(&t);
}

static void reservation_test_ticket_ticket(void)
{
        struct reservation_ticket t;
        struct reservation_object o, o2;

        reservation_object_init(&o);
        reservation_object_init(&o2);
        reservation_ticket_init(&t);

        object_reserve(&o, false, false, &t);
        object_reserve(&o2, false, false, &t);
        object_unreserve(&o2, &t);
        object_unreserve(&o, &t);

        reservation_ticket_fini(&t);
}

static void reservation_test_try_block(void)
{
        struct reservation_object o, o2;

        reservation_object_init(&o);
        reservation_object_init(&o2);

        object_reserve(&o, false, true, NULL);
        object_reserve(&o2, false, false, NULL);
        object_unreserve(&o2, NULL);
        object_unreserve(&o, NULL);
}

static void reservation_test_try_try(void)
{
        struct reservation_object o, o2;

        reservation_object_init(&o);
        reservation_object_init(&o2);

        object_reserve(&o, false, true, NULL);
        object_reserve(&o2, false, true, NULL);
        object_unreserve(&o2, NULL);
        object_unreserve(&o, NULL);
}

static void reservation_test_try_ticket(void)
{
        struct reservation_ticket t;
        struct reservation_object o, o2;

        reservation_object_init(&o);
        reservation_object_init(&o2);

        object_reserve(&o, false, true, NULL);
        reservation_ticket_init(&t);

        object_reserve(&o2, false, false, &t);
        object_unreserve(&o2, &t);
        object_unreserve(&o, NULL);

        reservation_ticket_fini(&t);
}

static void reservation_test_block_block(void)
{
        struct reservation_object o, o2;

        reservation_object_init(&o);
        reservation_object_init(&o2);

        object_reserve(&o, false, false, NULL);
        object_reserve(&o2, false, false, NULL);
        object_unreserve(&o2, NULL);
        object_unreserve(&o, NULL);
}

static void reservation_test_block_try(void)
{
        struct reservation_object o, o2;

        reservation_object_init(&o);
        reservation_object_init(&o2);

        object_reserve(&o, false, false, NULL);
        object_reserve(&o2, false, true, NULL);
        object_unreserve(&o2, NULL);
        object_unreserve(&o, NULL);
}

static void reservation_test_block_ticket(void)
{
        struct reservation_ticket t;
        struct reservation_object o, o2;

        reservation_object_init(&o);
        reservation_object_init(&o2);

        object_reserve(&o, false, false, NULL);
        reservation_ticket_init(&t);

        object_reserve(&o2, false, false, &t);
        object_unreserve(&o2, &t);
        object_unreserve(&o, NULL);

        reservation_ticket_fini(&t);
}

static void reservation_test_fence_block(void)
{
        struct reservation_object o;

        reservation_object_init(&o);
        spin_lock(&lock_A);
        spin_unlock(&lock_A);

        object_reserve(&o, false, false, NULL);
        spin_lock(&lock_A);
        spin_unlock(&lock_A);
        object_unreserve(&o, NULL);

        spin_lock(&lock_A);
        object_reserve(&o, false, false, NULL);
        object_unreserve(&o, NULL);
        spin_unlock(&lock_A);
}

static void reservation_test_fence_try(void)
{
        struct reservation_object o;

        reservation_object_init(&o);
        spin_lock(&lock_A);
        spin_unlock(&lock_A);

        object_reserve(&o, false, true, NULL);
        spin_lock(&lock_A);
        spin_unlock(&lock_A);
        object_unreserve(&o, NULL);

        spin_lock(&lock_A);
        object_reserve(&o, false, true, NULL);
        object_unreserve(&o, NULL);
        spin_unlock(&lock_A);
}

static void reservation_test_fence_ticket(void)
{
        struct reservation_ticket t;
        struct reservation_object o;

        reservation_object_init(&o);
        spin_lock(&lock_A);
        spin_unlock(&lock_A);

        reservation_ticket_init(&t);

        object_reserve(&o, false, false, &t);
        spin_lock(&lock_A);
        spin_unlock(&lock_A);
        object_unreserve(&o, &t);

        spin_lock(&lock_A);
        object_reserve(&o, false, false, &t);
        object_unreserve(&o, &t);
        spin_unlock(&lock_A);

        reservation_ticket_fini(&t);
}

static void reservation_tests(void)
{
        printk("  --------------------------------------------------------------------------\n");
        printk("  | Reservation tests |\n");
        printk("  ---------------------\n");

        print_testname("reservation api failures");
        dotest(reservation_test_fail_reserve, SUCCESS, LOCKTYPE_RESERVATION);
        printk("\n");

        print_testname("reserving two tickets");
        dotest(reservation_test_two_tickets, FAILURE, LOCKTYPE_RESERVATION);
        printk("\n");

        print_testname("unreserve ticket twice");
        dotest(reservation_test_ticket_unreserve_twice, FAILURE, LOCKTYPE_RESERVATION);
        printk("\n");

        print_testname("unreserve object twice");
        dotest(reservation_test_object_unreserve_twice, FAILURE, LOCKTYPE_RESERVATION);
        printk("\n");

        print_testname("spinlock nest unreserved");
        dotest(reservation_test_fence_nest_unreserved, FAILURE, LOCKTYPE_RESERVATION);
        printk("\n");

        printk("  -----------------------------------------------------\n");
        printk("                                 |block | try  |ticket|\n");
        printk("  -----------------------------------------------------\n");

        print_testname("ticket");
        dotest(reservation_test_ticket_block, FAILURE, LOCKTYPE_RESERVATION);
        dotest(reservation_test_ticket_try, SUCCESS, LOCKTYPE_RESERVATION);
        dotest(reservation_test_ticket_ticket, SUCCESS, LOCKTYPE_RESERVATION);
        printk("\n");

        print_testname("try");
        dotest(reservation_test_try_block, FAILURE, LOCKTYPE_RESERVATION);
        dotest(reservation_test_try_try, SUCCESS, LOCKTYPE_RESERVATION);
        dotest(reservation_test_try_ticket, FAILURE, LOCKTYPE_RESERVATION);
        printk("\n");

        print_testname("block");
        dotest(reservation_test_block_block, FAILURE, LOCKTYPE_RESERVATION);
        dotest(reservation_test_block_try, SUCCESS, LOCKTYPE_RESERVATION);
        dotest(reservation_test_block_ticket, FAILURE, LOCKTYPE_RESERVATION);
        printk("\n");

        print_testname("spinlock");
        dotest(reservation_test_fence_block, FAILURE, LOCKTYPE_RESERVATION);
        dotest(reservation_test_fence_try, SUCCESS, LOCKTYPE_RESERVATION);
        dotest(reservation_test_fence_ticket, FAILURE, LOCKTYPE_RESERVATION);
        printk("\n");
}


