Hello,

On Monday 06 March 2006 19:48, Enrique de la Torre Gordaliza wrote:
> Hi all!
>
> I'm a bit confused by the "reschedule on error" behavior. I'm running a
> 1.38.5 server on a Xeon EM64T linux (GNU/Debian) server.
>
> I've added the following to my job configuration (my workstation is not
> always online):
>
>   Rerun Failed Levels = yes
>   Reschedule on Error = yes
>   Reschedule Interval = 30   # (short time just for test)
>   Reschedule Times = 2
>   Run Before Job = "nc -w5 -z XXX.XXX.XXX.XXX 9102"
>
> I stop the file daemon, run this job from bconsole and wait a few minutes.
> It fails, but I get no mail about it (mail has been working for Backup
> Failed and OK notifications, and for operator notifications too). I've
> found some empty .mail files in the working directory:
>
> :/usr/local/bacula/bin/working# ls -1l *.mail
> -rw-r----- 1 bacula bacula 0 2006-03-06 18:42
> neptuno-dir.Sunipx1.2006-03-06_18.42.31.8031880.mail
>
> corresponding to this job. If I run "status dir" in bconsole:
>
> Running Jobs:
>  JobId Level   Name                        Status
> ======================================================================
>    328 Increme Sunipx1.2006-03-06_18.42.31 has a fatal error
> ====
>
> I cannot clear this "fatal error". Neither the cancel nor the delete
> command can. This doesn't seem to be the usual behavior of reschedule on
> error, is it? Is it a configuration problem? A compiler problem (it's
> 64 bit on linux but compiled without -O2)?
>
> Logs and trace don't show any suspicious error... If I try
>
>   Rerun Failed Levels = yes
>   # Reschedule on Error = yes
>   # Reschedule Interval = 30   # (short time just for test)
>   # Reschedule Times = 2
>   Run Before Job = "nc -w5 -z XXX.XXX.XXX.XXX 9102"
>
> it works great, with mail notification about the RunBeforeJob exit status,
> so it seems to be just a "Reschedule on Error" issue.
>

First, rescheduling has always been a bit fragile, and I haven't yet made a
regression script to test it, so there could well be a bug in 1.38.5.

That said, I don't really see a problem here. When the job fails, it probably
puts a message in the job report, but the job is then rescheduled and waits
for the reschedule interval to expire. While it is waiting, there is no way
to kill it off (in fact, if you try, it may get confused, as this is
something I did not test). Only when it is running again will you be able to
cancel it.

The only thing that looks a bit odd to me is that you apparently have a
30 second reschedule interval. However, if you modified this value without
stopping and restarting the Director, it will not take effect (i.e. even if
you do a reload, the value may not change).

Anyway, if you are really sure that this is not working, it would be worth
a bug report. I believe that several users *are* successfully using
rescheduling though ...
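
For reference, the directives under discussion belong in the Job resource of
the Director configuration. The fragment below is only a sketch built from
the values quoted above. The comments are an interpretation of the behavior
described in this message, not a statement of what 1.38.5 guarantees, and
the rest of the Job resource is left out.

```
Job {
  # Name, Client, FileSet, Schedule, etc. unchanged and omitted here
  Rerun Failed Levels = yes
  Reschedule On Error = yes    # on failure, queue the job again instead of ending it
  Reschedule Interval = 30     # wait before the retry; no unit given, read as seconds
  Reschedule Times = 2         # upper limit on the number of reschedules
  Run Before Job = "nc -w5 -z XXX.XXX.XXX.XXX 9102"
}
```

After changing any of these values, stop and restart the Director rather
than relying on a reload, since the new interval may otherwise not be picked
up. While the job sits in its reschedule wait it will still appear under
Running Jobs in "status dir" with the status from its last failed attempt.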