On 24 December 2015 at 20:15, Jeff Janes <jeff.ja...@gmail.com> wrote:
> On Wed, Dec 23, 2015 at 9:40 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:
> > On Wed, Sep 23, 2015 at 11:33 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:
> >>
> >> On further thought, neither do I. The attached patch inverts
> >> ResolveRecoveryConflictWithLock to be called back from the lmgr code so that
> >> it is like the ResolveRecoveryConflictWithBufferPin code. It does not try to
> >> cancel the conflicting lock holders from the signal handler; rather, it just
> >> loops an extra time and cancels the transactions on the next call.
> >>
> >> It looks like the deadlock detection is adequately handled within normal
> >> lmgr code within the back-ends of the other parties to the deadlock, so I
> >> didn't do a timeout for deadlock detection purposes.
> >
> > That is how I've done it.

It's taken me a while to figure this out.

My testing showed a bug in disable_timeout(), which turns out to be a
double-disable, which I've fixed. I'll submit a different patch to put in
some diagnostics if such cases show up again, which could happen now that we
have user-defined timeouts.

What surprises me is that I can't see how this patch ever worked as
submitted when run on an assert-enabled build.

If you want this backpatched, please submit versions that apply cleanly and
test them. I'm less inclined to do that myself; I just regard this as an
improvement.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
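
For readers unfamiliar with the failure mode, here is a minimal sketch of what a double-disable diagnostic could look like. It is not the PostgreSQL timeout code and not the patch mentioned above: the names demo_enable_timeout, demo_disable_timeout, and the DEMO_* identifiers are hypothetical, and the bookkeeping is assumed to be a simple per-timeout "active" flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical timeout identifiers; these are NOT PostgreSQL's TimeoutId values. */
enum { DEMO_LOCK_TIMEOUT, DEMO_DEADLOCK_TIMEOUT, DEMO_MAX_TIMEOUTS };

/* One flag per timeout: true while the timeout is armed. */
static bool demo_active[DEMO_MAX_TIMEOUTS];

/* Number of redundant disable calls seen so far. */
static int demo_double_disables;

/* Arm a timeout (the actual timer scheduling is omitted in this sketch). */
static void demo_enable_timeout(int id)
{
    assert(id >= 0 && id < DEMO_MAX_TIMEOUTS);
    demo_active[id] = true;
}

/*
 * Disarm a timeout. Returns true if the timeout was armed. If it was not
 * armed, this is a double-disable: count it, report it, and return false
 * instead of silently accepting the call.
 */
static bool demo_disable_timeout(int id)
{
    assert(id >= 0 && id < DEMO_MAX_TIMEOUTS);
    if (!demo_active[id])
    {
        demo_double_disables++;
        fprintf(stderr, "warning: timeout %d disabled while not active\n", id);
        return false;
    }
    demo_active[id] = false;
    return true;
}
```

Calling demo_disable_timeout() twice in a row for the same id makes the second call return false and increments the counter, which is the kind of case that would otherwise only surface on an assert-enabled build.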