If you go through the logs on minotaur (they are the same on eos; e.g. grep stale --context 10), you will see that connect() is throwing an (unknown) exception, which causes the daemon to shut down.
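
To put a name on that unknown exception, something along these lines would log whatever connect() raises and then try again. This is only a minimal sketch, not the code from the commit: connect_with_retry, its parameters and the logger name are all made up for illustration, and the sleep between tries blocks, so an event-driven client would need to schedule the retry instead.

```python
import logging
import time

log = logging.getLogger("svnpubsub.sketch")  # hypothetical logger name


def connect_with_retry(connect, attempts=5, delay=15.0, sleep=time.sleep):
    """Call connect() until it succeeds or the attempts run out.

    connect, attempts, delay and sleep are illustrative parameters,
    not part of svnpubsub.
    """
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except Exception:
            # Unlike a bare "except:", this lets KeyboardInterrupt and
            # SystemExit through. logging.exception() records the full
            # traceback, so the exception type shows up in the log.
            log.exception("connect failed (attempt %d of %d)",
                          attempt, attempts)
            if attempt == attempts:
                raise  # give up and let the caller see the real error
            sleep(delay)
```

Passing the sleep function in keeps the sketch easy to exercise without actually waiting; the default delay is an arbitrary placeholder, not a measured value.
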
What I think is behind this is that every so often the FreeBSD NIC drivers stall for up to a minute before resetting themselves. Sometimes a simple connect will work; other times it takes a few more tries. Feel free to adjust the logic to taste. Coming from Perl, where everything is a string, I have little patience for the mass of Exception classes to deal with.

>________________________________
> From: Greg Stein <gst...@gmail.com>
> To: dev@subversion.apache.org; Joe Schaefer <joe_schae...@yahoo.com>
> Sent: Sunday, March 11, 2012 11:44 PM
> Subject: Re: svn commit: r1299519 -
> /subversion/trunk/tools/server-side/svnpubsub/svnpubsub/client.py
>
> What did you observe here, and do you have some logs that demonstrate the
> problem? I tested the reconnect logic, and it all appeared to work, so maybe
> something else is going on here. The code below blocks the network processing.
> The event should also be a simple word, so the callback can switch on it. In
> this case, it can call logging.exception() so we can see what was caught. Bare
> "except:" clauses are always dangerous since it could paper over even a simple
> typo on an attribute access.
>
> Thx,
> -g
>
> On Mar 11, 2012 10:03 PM, <j...@apache.org> wrote: