Blair Zajac <bl...@orcaware.com> writes:

>> --- subversion/libsvn_fs_fs/fs_fs.c (revision 1063629)
>> +++ subversion/libsvn_fs_fs/fs_fs.c (working copy)
>> @@ -583,6 +583,14 @@
>>        err = body(baton, subpool);
>>      }
>>
>> +  if (!strcmp(lock_filename, path_txn_current_lock(fs, pool)))
>> +    {
>> +      int i, j, k = 0;
>> +      for (i = 0; i < 200000; ++i)
>> +        for (j = 0; j < 10000; ++j)
>> +          k += j + i;
>> +    }
>
> I'm guessing a sleep would also work well here?

I was deliberately trying to keep the server in userspace rather than
kernel space, but it appears to make little difference in practice.
I've also moved the code a little earlier, into the if (!err) block,
which makes the behaviour easier to follow. The fourth commit now
fails immediately while the other three still complete.

>> +
>>    svn_pool_destroy(subpool);
>>
>>  #if SVN_FS_FS__USE_LOCK_MUTEX
>>
>> I start the four commits above, one after the other with a small delay
>> between them. The commits run in parallel and they all block, then
>> after about 10 seconds the first two complete, and then the third fails
>> with:
>>
>> svn: Can't get exclusive lock on file
>> '/home/pm/sw/subversion/obj/repoA/db/txn-current-lock': Resource deadlock
>> avoided
>>
>> Then after a bit more delay the fourth completes.
>>
>> I didn't expect the failed commit. On the other hand it appears to be
>> handled gracefully: the server keeps running and the client gets
>> notified.
>
> Yes, we don't see any issues with the repository after a failed lock.
>
> I think there's a deadlock because the kernel assumes that the process
> is single threaded. It doesn't know that the other, non-blocked,
> threads will make progress and release the lock.

It appears that the kernel's deadlock detection sometimes produces
false positives, but that they are essentially harmless: the user
simply retries the commit. I guess it would be safe to have the server
implement some sort of retry loop.

-- 
Philip
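
For illustration, here is a minimal sketch of the kind of retry loop suggested above. It is not Subversion code: it assumes the lock is taken directly with POSIX fcntl(F_SETLKW) rather than through APR, and the helper name lock_with_retry, the max_attempts parameter and the 100ms backoff step are all hypothetical. Only EDEADLK ("Resource deadlock avoided") is retried; any other error is returned to the caller unchanged.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical helper (not part of Subversion or APR): take an
   exclusive whole-file lock on FD, retrying up to MAX_ATTEMPTS times
   when the kernel reports EDEADLK.  Returns 0 on success, or -1 with
   errno set on failure. */
static int
lock_with_retry(int fd, int max_attempts)
{
  int attempt;

  assert(max_attempts > 0);

  for (attempt = 0; attempt < max_attempts; ++attempt)
    {
      struct flock fl = { 0 };
      struct timespec delay = { 0, 0 };

      fl.l_type = F_WRLCK;     /* exclusive (write) lock */
      fl.l_whence = SEEK_SET;
      fl.l_start = 0;
      fl.l_len = 0;            /* 0 means "through end of file" */

      /* F_SETLKW blocks until the lock is granted or an error occurs. */
      if (fcntl(fd, F_SETLKW, &fl) == 0)
        return 0;

      /* Anything other than a reported deadlock is a real failure. */
      if (errno != EDEADLK)
        return -1;

      /* Assumed backoff: 100ms, 200ms, ... before trying again, to
         give the current lock holder time to finish and unlock. */
      delay.tv_nsec = 100L * 1000L * 1000L * (attempt + 1);
      if (delay.tv_nsec >= 1000000000L)
        delay.tv_nsec = 999999999L;
      nanosleep(&delay, NULL);
    }

  /* Every attempt was reported as a deadlock: give up. */
  errno = EDEADLK;
  return -1;
}
```

A caller would replace a single blocking lock call with something like lock_with_retry(fd, 5) and report the error to the client only if that still fails. Bounding the number of attempts matters, because a genuine deadlock would otherwise spin forever.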