Blair Zajac <bl...@orcaware.com> writes:

>> --- subversion/libsvn_fs_fs/fs_fs.c  (revision 1063629)
>> +++ subversion/libsvn_fs_fs/fs_fs.c  (working copy)
>> @@ -583,6 +583,14 @@
>>         err = body(baton, subpool);
>>       }
>>
>> +  if (!strcmp(lock_filename, path_txn_current_lock(fs, pool)))
>> +    {
>> +      int i, j, k = 0;
>> +      for (i = 0; i < 200000; ++i)
>> +        for (j = 0; j < 10000; ++j)
>> +          k += j + i;
>> +    }
>
> I'm guessing a sleep would also work well here?

I was deliberately trying to keep the server busy in userspace rather
than in the kernel, but it appears to make little difference in practice.

I've also moved the code a little earlier, into the if (!err) block,
which makes the behaviour easier to follow.  The fourth commit now fails
immediately while the other three still complete.

>> +
>>     svn_pool_destroy(subpool);
>>
>>   #if SVN_FS_FS__USE_LOCK_MUTEX
>>
>> I start the four commits above, one after the other with a small delay
>> between them.  The commits run in parallel and they all block, then
>> after about 10 seconds the first two complete, and then the third fails
>> with:
>>
>> svn: Can't get exclusive lock on file
>> '/home/pm/sw/subversion/obj/repoA/db/txn-current-lock': Resource deadlock
>> avoided
>>
>> Then after a bit more delay the fourth completes.
>>
>> I didn't expect the failed commit.  On the other hand it appears to be
>> handled gracefully: the server keeps running and the client gets
>> notified.
>
> Yes, we don't see any issues with the repository after a failed lock.
>
> I think there's a deadlock because the kernel assumes that the process
> is single threaded.  It doesn't know that the other, non-blocked,
> threads will make progress and release the lock.

It appears that the kernel deadlock detection sometimes produces false
positives but that it is essentially harmless: the user simply retries
the commit.

I guess it would be safe to have the server implement some sort of retry
loop.
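
A minimal sketch of what such a retry loop might look like, assuming the
lock is ultimately a POSIX fcntl() record lock (which can fail with
EDEADLK, "Resource deadlock avoided").  The names lock_exclusive and
lock_with_retry, and the backoff values, are hypothetical; this is not
Subversion or APR code, just an illustration of retrying only on the
deadlock error:

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: block until an exclusive whole-file lock is
   obtained on FD.  Returns 0 on success, otherwise an errno value. */
static int
lock_exclusive(int fd)
{
  struct flock fl;

  memset(&fl, 0, sizeof(fl));
  fl.l_type = F_WRLCK;
  fl.l_whence = SEEK_SET;
  fl.l_start = 0;
  fl.l_len = 0;               /* 0 means "through end of file" */

  while (fcntl(fd, F_SETLKW, &fl) == -1)
    {
      if (errno != EINTR)     /* restart only if interrupted by a signal */
        return errno;
    }
  return 0;
}

/* Hypothetical helper: try the lock up to MAX_ATTEMPTS times, retrying
   only when the kernel reports EDEADLK.  Any other result, success
   included, is returned at once.  The backoff values are arbitrary. */
static int
lock_with_retry(int fd, int max_attempts)
{
  int attempt;
  int err = EINVAL;

  assert(max_attempts > 0);

  for (attempt = 0; attempt < max_attempts; ++attempt)
    {
      err = lock_exclusive(fd);
      if (err != EDEADLK)
        return err;

      /* Sleep a little longer after each failure, but not after the
         last one, so that the other lockers can finish first. */
      if (attempt + 1 < max_attempts)
        {
          long delay_ms = 10L * (attempt + 1);
          struct timespec delay;

          delay.tv_sec = delay_ms / 1000;
          delay.tv_nsec = (delay_ms % 1000) * 1000000L;
          nanosleep(&delay, NULL);
        }
    }
  return err;                 /* still EDEADLK after every attempt */
}
```

The attempt count is bounded so that a genuine deadlock is still
reported to the client rather than retried forever.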

-- 
Philip
