On Tue, Jan 25, 2011 at 11:00:31PM -0800, Blair Zajac wrote:
> We're seeing deadlocks in our Subversion multithreaded server when
> two distinct processes try to fcntl(F_SETLKW) on two fsfs
> repositories' db/txn-current-lock, when the processes begin
> transactions in reverse order.
>
> Process 1                          Process 2
> ---------                          ---------
> thread 1: begin txn in repos A     thread 1: begin txn in repos B
> thread 2: begin txn in repos B     thread 2: begin txn in repos A
>
> During normal working hours, we get over 1 commit per second,
> peaking at 6, which is why we're seeing this.
>
> Questions:
>
> Should a fix for this be put in libsvn_fs_fs() or should I do this
> in my application? I'm thinking putting this in libsvn_fs_fs() is
> an appropriate fix, even though other people probably won't see it.
>
> I'm also thinking the code should retry a maximum of 100 times with
> a 1ms sleep, doubling each sleep upon failure to a maximum 128 ms,
> such as WIN32_RETRY_LOOP.
>
> Comments?

If possible it should be fixed in libsvn_fs_fs.

Stefan
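
For illustration, the retry scheme proposed in the quoted message (up to
100 attempts, a 1 ms first sleep, doubling to a cap of 128 ms) might be
sketched in C roughly as follows. This is a minimal sketch, not
Subversion code: lock_with_backoff, next_sleep_ms and sleep_ms are
hypothetical names, and it assumes the failed lock attempt shows up as
EDEADLK from fcntl(F_SETLKW), which the original message does not state.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/* Values taken from the proposal in the quoted message. */
enum { MAX_ATTEMPTS = 100, FIRST_SLEEP_MS = 1, MAX_SLEEP_MS = 128 };

/* Double the sleep interval, capped at MAX_SLEEP_MS. */
static int next_sleep_ms(int ms)
{
    return (ms * 2 > MAX_SLEEP_MS) ? MAX_SLEEP_MS : ms * 2;
}

/* Sleep for roughly the given number of milliseconds. */
static void sleep_ms(int ms)
{
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (long)(ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}

/* Hypothetical helper: take an exclusive whole-file lock on fd.
 * ASSUMPTION: the failure to retry on is EDEADLK; any other error is
 * returned to the caller at once.
 * Returns 0 on success, -1 with errno set on failure. */
static int lock_with_backoff(int fd)
{
    struct flock fl = {0};
    int delay_ms = FIRST_SLEEP_MS;
    int attempt;

    assert(fd >= 0);
    fl.l_type = F_WRLCK;    /* exclusive lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;           /* 0 means "to end of file" */

    for (attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
        if (fcntl(fd, F_SETLKW, &fl) == 0)
            return 0;
        if (errno != EDEADLK)
            return -1;      /* not the case this sketch handles */
        sleep_ms(delay_ms);
        delay_ms = next_sleep_ms(delay_ms);
    }
    errno = EDEADLK;        /* gave up after MAX_ATTEMPTS tries */
    return -1;
}
```

A caller would open db/txn-current-lock, pass the descriptor to
lock_with_backoff(), and treat a -1 return as a failure to begin the
transaction. Whether such a loop belongs in libsvn_fs_fs or in the
application is the question the thread is discussing.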