J. Roeleveld <jo...@antarean.org> wrote:
>
> With the kind of schedules I am working with (and I believe Alan will
> also end up with), restarting the whole process from the start can
> lead to issues.
> Finding out how far the process got before the service crashed can become
> rather complex.

I am not sure whether I understand this correctly: schedule has no problem displaying which tasks have finished, failed, or are still running at any time. Of course, a finer granularity than tasks is not possible ("how far has a certain task got?"), because this would require knowledge about the task and how to check it. To get a finer granularity in "schedule", you need to split your tasks into more shell commands.

You can just rerun your "driving" script. Tasks which have already finished/failed will not actually be restarted; they behave as if they finish immediately and report that they are finished/failed. (If you plan to do this, I would suggest scheduling things like "sleep" as separate tasks, too, rather than building them into the "driving" script.)

If there is an unexpected problem and, e.g., you want to re-run a failed task anyway, you can just re-queue your new task in the same place as the previous task, e.g.

    schedule remove jobnr
    schedule -j jobnr queue command to do your task

Then the old job (and its state) is replaced by the newly queued job, and your driving script (identical to before) will start it instead of assuming that the job is already finished.

To avoid races, I would recommend doing the above only while your driving script is not running: the problem is that between the above two commands the jobs after "jobnr" are renumbered. (E.g., you can suspend the script with Ctrl-z if you have written it in (...) or if it is really a "classical" script, and then continue it with "fg"; or you can even stop it completely with Ctrl-c and re-run it, depending on what you want.)

Alternatively, you can insert your new job at the end of the job list and then use something like (untested)

    schedule -jjobnr insert 0 jobnr+1:-1
    schedule remove 0

to re-sort your job list: the "insert" is race-free, and having an extra job at the end for some time will hopefully not disturb anything.
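
For convenience, the remove-and-requeue pair could be wrapped in a small shell function. This is only a sketch: the name "requeue_job" is made up for illustration, and the only "schedule" calls it makes are the two shown in this message. It does nothing about the renumbering race, so it should be run only while the driving script is suspended or stopped.

```shell
#!/bin/sh
# requeue_job JOBNR COMMAND [ARGS...]
# Hypothetical helper (not part of schedule): replace the job at JOBNR
# with a newly queued command, using only the two calls from this message.
# Jobs after JOBNR are renumbered between the two calls, so run this
# only while the driving script is not running.
requeue_job() {
    if [ "$#" -lt 2 ]; then
        echo "usage: requeue_job JOBNR COMMAND [ARGS...]" >&2
        return 2
    fi
    jobnr=$1
    shift
    # Drop the old job together with its finished/failed state.
    schedule remove "$jobnr" || return 1
    # Queue the replacement in the slot the old job occupied.
    schedule -j "$jobnr" queue "$@"
}
```

With the driving script suspended, "requeue_job 3 make install" would then stand in for typing the two commands by hand (job number and command are placeholders). The function stops after a failed "remove", so that a second copy of the task is not queued next to a job that is still in the list.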