In article <[EMAIL PROTECTED]>,
"Paddy" <[EMAIL PROTECTED]> writes:
|>
|> > |> Three to four months before `strange errors`? I'd spend some time
|> > |> correlating logs; not just for your program, but for everything running
|> > |> on the server. Then I'd expect to cut my losses and arrange to safely
|> > |> re-start the program every TWO months.
|> > |> (I'd arrange the re-start after collecting logs but before their
|> > |> analysis. Life is too short).
|> >
|> > Forget it. That strategy is fine in general, but is a waste of time
|> > where threading issues are involved (or signal handling, or some types
|> > of communication problem, for that matter).
|>
|> Nah, Its a great strategy. it keeps you up and running when all you
|> know for sure is that you will most likely be able to keep things
|> together for three months normally.
|>
|> The OP only thinks its a threading problem - it doesn't matter what the
|> true fix will be, as long as arranging to re-start the server well
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|> before its likely to go down doesn't take too long, compared to your
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|> exploration of the problem, and, of course, you have to be able to
|> afford the glitch in availability.

Consider the marked phrase in the context of a Poisson process failure
model, and laugh. If you don't understand why I say that, I suggest
finding out the properties of the Poisson process!


Regards,
Nick Maclaren.
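
For anyone who would rather check the Poisson point than look it up, here is a minimal sketch. It assumes failures really do arrive as a Poisson process, i.e. the time to the next failure is exponentially distributed with a constant rate. The mean life of 3.5 months is only an illustrative guess based on the "three to four months" quoted above, and every function name is hypothetical. Under that assumption the process is memoryless: a server that has already run for two months is exactly as likely to survive the next month as one that was restarted a second ago, so a scheduled restart "well before it's likely to go down" buys nothing.

```python
import math
import random


def survival(t, mean_life):
    """P(no failure before time t) for an exponential lifetime."""
    return math.exp(-t / mean_life)


def conditional_survival(age, t, mean_life):
    """P(surviving a further t, given the process has already run for `age`)."""
    return survival(age + t, mean_life) / survival(age, mean_life)


def count_failures(horizon, mean_life, restart_every=None, rng=None):
    """Count unplanned failures over `horizon` months.

    restart_every=None means the process is only restarted after a failure;
    otherwise it is also restarted on a fixed schedule.
    """
    if rng is None:
        rng = random.Random()
    now, failures = 0.0, 0
    while now < horizon:
        life = rng.expovariate(1.0 / mean_life)  # time to next failure
        if restart_every is not None and life >= restart_every:
            now += restart_every  # planned restart got there first
        else:
            now += life  # the failure got there first
            if now < horizon:
                failures += 1
    return failures


def average_failures(trials, horizon, mean_life, restart_every=None, seed=None):
    """Average the failure count over many simulated runs."""
    rng = random.Random(seed)
    total = sum(
        count_failures(horizon, mean_life, restart_every, rng)
        for _ in range(trials)
    )
    return total / trials


if __name__ == "__main__":
    mean_life = 3.5  # months; illustrative assumption
    print("fresh process, 1 more month:  ", survival(1.0, mean_life))
    print("2 months old, 1 more month:   ", conditional_survival(2.0, 1.0, mean_life))
    print("failures/10y, never restarted:", average_failures(2000, 120, mean_life))
    print("failures/10y, restart every 2:", average_failures(2000, 120, mean_life, 2.0))
```

The two survival figures agree, and the two failure averages should both come out close to 120 / 3.5, with or without the scheduled restart. The sketch covers only the constant-rate case; if the failure rate grew with uptime, the comparison would come out differently.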