J. Roeleveld <jo...@antarean.org> wrote:
>
> Depends on the specific requirements.
> If you want:

In a sense, most of what you require can be done with the "schedule" tool I mentioned, although perhaps not in the way you expected. I have reordered your points for a clearer explanation:

> - have schedules operate over multiple machines (eg. part run on
> database, some on a compute-cluster, some other bit making nice graphs
> and printing it,...)

Since "schedule" can use TCP for communication, this should not be a problem if you let "schedule-server" listen on all network interfaces (export SCHEDULE_SERVER_OPTS=-a0.0.0.0).

For the actual scheduling you must set up your machines accordingly: on one machine, queue the task doing the database access you want (with "schedule -a[serveraddress] queue command_to_access_database"), and do the same on the other machines for their tasks. Of course, ssh or anything else can be used to do this without physically accessing the machines. Then, on one machine (not necessarily that of the server), you run an appropriate "driver" script.

> - time based start of a schedule
> - dependencies in said schedules and between schedules which can delay
> the actual start
> - stop of schedule if error occurs

None of this is a problem, since the "driver" script is just a shell script which calls "schedule" to start the tasks, wait for them to finish, and/or check their exit status. This is perhaps inconvenient but has the advantage of being completely flexible: you can use any Linux tool such as "sleep" (or at or cron) to get whatever delays you want, run tests more powerful than checking the exit status, etc.

> - ability to restart schedule from crashed point

Running jobs that had not yet started before a crash is not a problem: you just edit your "driver" script appropriately and restart it. Jobs which were already running need to be re-queued if they should run again.
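
To give a rough idea of what such a "driver" script might look like, here is a minimal sketch of the control flow only: a delayed start, steps that depend on each other, and a stop at the first error. The function names run_step and run_pipeline are made up for this illustration, and the step commands are placeholders passed in by the caller; the actual "schedule" calls that start a job and wait for it are deliberately not spelled out here and would have to be filled in for your setup.

```shell
#!/bin/sh
# Sketch of a "driver" script. The helper names and the step commands are
# assumptions for illustration; substitute wrappers around your real
# "schedule" calls for the placeholder commands.

# run_step NAME COMMAND [ARGS...]
# Runs COMMAND, reports a failure on stderr, and returns COMMAND's exit status.
run_step() {
    step_name=$1
    shift
    "$@"
    step_status=$?
    if [ "$step_status" -ne 0 ]; then
        echo "driver: step '$step_name' failed with exit status $step_status" >&2
    fi
    return "$step_status"
}

# run_pipeline DELAY_SECONDS DB_CMD COMPUTE_CMD REPORT_CMD
# Waits DELAY_SECONDS, then runs the three steps in order. Each *_CMD is a
# single word, e.g. the name of a wrapper script or shell function.
# Each step depends on the previous one: "&&" stops at the first failure.
run_pipeline() {
    sleep "$1" &&
    run_step database "$2" &&
    run_step compute "$3" &&
    run_step report "$4"
}
```

A call such as "run_pipeline 3600 db_wrapper compute_wrapper report_wrapper" (with three hypothetical wrapper scripts) would then wait an hour and run the steps in order. Instead of the sleep delay, the whole script could equally be launched from at or cron.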