Hello,

As you know, the current job scheduling has a few deficiencies, particularly when your backups get blocked for some reason (a bad tape driver, or operator intervention required), which can lead to a big pile of duplicate jobs being scheduled.

We have previously discussed ways of fixing this, and some really good ideas came up. I am now ready to take a stab at implementing it, and would like to present the current design and let some of you help in the design process. I am currently pretty busy with my own project and with helping on two major projects that are making very nice progress, so I would appreciate some input.

My current idea is to create a new "DuplicateJobs" resource and a new Duplicate Jobs directive, which would point to the DuplicateJobs resource. The reason for a separate resource is that there are so many different variations that they would require a lot of new directives, and it seems a shame to add them to every Job.

My current design calls for a DuplicateJobs resource that looks something like the following:

  DuplicateJobs {
    Name = "xxx"
    Allow = yes|no                    (no = default)
    AllowHigherLevel = yes|no         (no)
    AllowLowerLevel = yes|no          (no)
    AllowSameLevel = yes|no
    Cancel = Running | New            (no)
    CancelledStatus = Fail | Skip     (fail)
    Job Proximity = <time-interval>   (0)
  }

The first directive, "Allow", is probably not needed, but it does make the resource more complete. If it is set to yes, all the other directives are ignored, which is the same as today's behavior and the same as having no Duplicate Jobs directive in the Job resource.

The AllowXXX directives try to define which job will be allowed to continue when one job is running or waiting and a new one arrives. For example, AllowHigherLevel = yes would mean that the higher level job is allowed to continue.

The Cancel directive specifies which job to cancel (the new job or the job already there). I think there is probably a logic conflict between this directive and the AllowXXX directives, but I have not thought this through carefully enough.

CancelledStatus is an attempt to tell Bacula either to fail one of the two jobs or to Skip it, which means to kill it but without a lot of noise.
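
To make the discussion concrete, here is a minimal sketch in Python of one possible reading of these directives. It is not Bacula code, and every name in it (DuplicateJobsPolicy, resolve_duplicate, LEVEL_RANK) is hypothetical. It assumes that levels are ordered Incremental < Differential < Full, that an AllowXXX directive set to yes lets both jobs proceed, that Cancel is consulted only when the duplicate is not allowed, and that Cancel defaults to New and AllowSameLevel to no (the proposal does not give clear defaults for these two). Job Proximity is left out because its semantics are not settled yet.

```python
from dataclasses import dataclass

# Assumed ordering of backup levels, lowest to highest.
LEVEL_RANK = {"Incremental": 0, "Differential": 1, "Full": 2}


@dataclass
class DuplicateJobsPolicy:
    """Hypothetical in-memory form of the proposed DuplicateJobs resource."""
    name: str
    allow: bool = False
    allow_higher_level: bool = False
    allow_lower_level: bool = False
    allow_same_level: bool = False   # default assumed; not given in the proposal
    cancel: str = "New"              # "Running" or "New"; default assumed
    cancelled_status: str = "Fail"   # "Fail" or "Skip"


def resolve_duplicate(policy, existing_level, new_level):
    """Return (victim, status); victim is None when both jobs may proceed."""
    if policy.allow:
        # Allow = yes: every other directive is ignored.
        return (None, None)

    # Compare the new job's level with the job that is already there.
    diff = LEVEL_RANK[new_level] - LEVEL_RANK[existing_level]
    if diff > 0:
        permitted = policy.allow_higher_level
    elif diff < 0:
        permitted = policy.allow_lower_level
    else:
        permitted = policy.allow_same_level

    if permitted:
        return (None, None)

    # Duplicate not allowed: Cancel picks the victim,
    # CancelledStatus picks how the victim is reported.
    return (policy.cancel, policy.cancelled_status)
```

With this reading, AllowXXX and Cancel never apply to the same pair of jobs, which is one way around the possible conflict mentioned above. Other readings, such as "the higher level job always wins", would need a different rule.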

Some options for handling cancelled jobs that I could think of, and that are not yet clearly specified, are:

  - Do not kill a running job in favor of a newly scheduled job.
  - Do not print any messages about cancelling a job (I don't particularly like this idea).
  - Do not record any cancelled job in the catalog.
  - ...

Finally, Job Proximity is meant to allow a bit of overlap. For example, if a job has been running for 20 minutes or ran 20 minutes ago, you might not want to apply the rules.

As you can see, there is a lot of room for clarifying what should be done, and also a need for a bit more functionality. In other words, a bit more design is needed before beginning the implementation.

Comments?

Best regards,

Kern

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users