Hello Bill,

By the way, I have just committed another patch for the problem of marking all volumes read-only.  If you have time, please test the latest code in the repo; I think it should address the last note you put into your bug #2329.

The case you cite below looks to me like Bacula is behaving as designed.  Basically, if the device is not there, Bacula makes a few passes at trying to find it and then simply fails the job.  I am not sure how Bacula would "trap" this sort of situation, and to me it does not really make sense for Bacula to notify the operator, because it is not a simple mount request.  The operator will notice the problem when the job fails.  Bottom line: if devices defined in the SD are not there, Bacula will fail the jobs after trying a few times.

If you have a good idea on some other action, I am willing to listen.

Best regards,

Kern


On 10/28/2017 06:54 PM, Bill Arlofski wrote:
On 10/28/2017 10:10 AM, Phil Stracchino wrote:
On 10/28/17 04:15, Kern Sibbald wrote:
Hello,

Thanks for the feedback.  Can you confirm that your Bacula signs on with
version 9.0.5?  If so, it means that some recent patches that I have
made for this problem (3-4 bug reports) solve the problem :-)

I will definitely download and test.

Hi Kern,

Version 9.0.5 from git seems to mitigate this issue (with a caveat or two).

In my quick test here, I set up a restore job knowing full well that the
storage array with the volumes required to do the restore was powered off -
hoping to force Bacula to ask the operator for a volume and then wait. :)

Bacula attempted to reserve, and then access each of the 6 disk devices in the
autochanger. Of course it could not open any of them because the array was
off/dismounted. It properly warned me for each device, then tried to loop
through the 6 drives 3 more times (for a total of 4 loops), and then it marked
the job as:

is waiting on Storage "aoe-file"

So far so good.

However, a few things:

1. It performed this loop every 30 seconds, generating a lot of logging.
2. It never mailed the operator to ask for a volume.
3. After pretty much exactly 10 minutes, the job was failed and a normal "job
failure" email was sent to the admin.

The second point might be the correct behavior, since Bacula is not trying to
find a volume; it simply cannot access the defined drive devices. Something to
think about... Should an operator be notified when a device cannot be opened
by the SD?

Attached are the first and last loops and the job summary, since they would
just wrap horribly in this email. :)


Kern, I understand that this test I just did may be a corner case and may just
be throwing a monkey wrench into the mix, but similar scenarios have been seen
in BEE Support, so it might be sensible to trap for this.

Best regards,

Bill





------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot


_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
