Am 26.04.2011 21:55, schrieb Hans Georg Schaathun:

Now, I would like to use remote hosts as well, more precisely, student
lab boxen which are rather unreliable.  By experience I'd expect to
lose roughly 4-5 jobs in 100 CPU hours on average.  Thus I need some
way of detecting lost connections and requeue unfinished tasks,
avoiding any serious delays in this detection.  What is the best way to
do this in python?

As far as I understand, you acquire a job, send it to a remote host via a socket and then wait for the answer. Is that correct?

In this case, I would put running jobs together with the respective socket in a "running queue". If you detect a broken connection, put that job into the "todo" queue again.
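A minimal sketch of that bookkeeping (the names `todo`, `running`, `dispatch` and `handle_disconnect` are my own invention, not anything from your code):

```python
import queue

# Jobs waiting to be dispatched.
todo = queue.Queue()
# Jobs currently out on a remote host, keyed by the socket handling them.
running = {}

def dispatch(job, sock):
    # Remember which socket is working on the job before sending it out.
    running[sock] = job

def handle_disconnect(sock):
    # On a broken connection, move the unfinished job back to "todo".
    job = running.pop(sock, None)
    if job is not None:
        todo.put(job)
```

The networking thread that detects the broken socket simply calls `handle_disconnect`, and the dispatcher will pick the job up again on its next `todo.get()`.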


... if I could detect disconnects and
requeue the tasks from the networking threads.  Is that possible
using python sockets?

Of course, why not? It may depend on settings such as keepalive; but generally you should get an exception when attempting communication over a disconnected connection (over a disconnection? ;-))
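A small local demonstration (using a socket pair rather than your real lab network): once the peer closes, `recv()` returns `b""` (EOF), and a later `send()` would raise `BrokenPipeError` or `ConnectionResetError` -- either of which your networking thread can treat as "requeue this job".

```python
import socket

# Local pair of connected sockets standing in for worker <-> server.
a, b = socket.socketpair()
b.close()                  # simulate the remote host dying

eof = a.recv(1024)         # returns b"" because the peer is gone
a.close()
print("peer disconnected:", eof == b"")
```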

When going over the network, avoid pickling. Better to use your own protocol.
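One possible shape for such a protocol (this is my suggestion, not something from your setup): length-prefixed JSON messages, which sidesteps the security problem of unpickling data sent from untrusted lab machines.

```python
import json
import struct

def pack(msg):
    # Serialise to JSON and prepend a 4-byte big-endian length header.
    payload = json.dumps(msg).encode("utf-8")
    return struct.pack("!I", len(payload)) + payload

def unpack(data):
    # Read the length header, then decode exactly that many bytes.
    (length,) = struct.unpack("!I", data[:4])
    return json.loads(data[4:4 + length].decode("utf-8"))
```

On the receiving side you would loop on `recv()` until the 4 header bytes and then `length` payload bytes have arrived.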


Thomas
--
http://mail.python.org/mailman/listinfo/python-list
