* Kern Sibbald ([EMAIL PROTECTED]) wrote:
> On Friday 28 October 2005 12:00, Michele Baldessari wrote:
> > Since I've spent some time looking at bacula's and mt's source code to
> > figure out why just bacula was barfing on an empty tape drive on a
> > 2.6.x linux kernel, I think it might be worth adding the following to
> > the FAQ. (Apologies if something similar is in the docs already..I did
> > look but found nothing relevant).
> >
> > Note: I stumbled upon this issue while configuring bacula 1.36.2 on a
> > 2.6.x based system on an IBM 4560SLX tape library.
> >
> > thanks for all your work,
> > Michele Baldessari
> >
> > * Bacula, kernel 2.6.x and SCSI hangs when tape drive is empty
> >
> > Bacula 1.36.x does not open the device with O_NONBLOCK and thus isn't
> > able to cope with the kernel change in the st driver [1]. The solution
> > is either to upgrade bacula to a 1.37.x version or to use a 2.4.x kernel.
> >
> > [1] http://www.ussg.iu.edu/hypermail/linux/kernel/0302.2/0066.html :
> > """
> > The open() behaviour of st was changed at 2.5.3 to conform with SUS
> > (blocking) and what the other Unices do (timeout). If the device is opened
> > without O_NONBLOCK, the driver waits for some time (default 2 minutes)
> > for the device to become ready. If it does not become ready, an error is
> > returned.

Hi Kern,

> Do you know where the kernel documentation on this is and where I can find out
> more about SUS (whatever it means)?

SUS is the Single Unix Specification. You should be able to find it here:
http://www.unix.org/online.html (It's free but you need to register)

> My experience here is that if there is no tape in the drive, you must open it
> in read-only mode and with O_NONBLOCK set. In that case, it will open the
> drive, and you can then try to figure out that there is no media present.
> This is a real nightmare for me (and Bacula) because the use of O_NONBLOCK
> seems to be very system dependent -- for example, on FreeBSD the ioctl() to
> clear the O_NONBLOCK apparently is not valid on a tape drive.

Ouch.

> As far as I can tell, this change has destroyed a number of functions such as
> the Polling in Bacula, and if you use "Offline on Unmount", and Bacula wants
> another tape, Bacula will retry opening the drive for 5 minutes (based on the
> idea that the open fails immediately), however since the open() blocks two
> minutes, Bacula will fail the job after 20 minutes.
>
> I wonder if the guys who changed this in the kernel were aware of the
> consequences.

The only reference I had found was the one I quoted in my original mail
unfortunately, and it only claimed more conformance to SUS.

> If anyone knows if and where the kernel guys publish these things, I sure
> would like to know. If I have to read through 50 million lines of kernel
> change log to find this information, which I cannot do, we will probably have
> more of these kinds of surprises in the future ... :-(

I'm afraid these things are to be hunted down in changelogs and lists. The
change took place quite some time ago though (2.5.x).

Maybe the best way out of this is to create a test program and ask
around the lists to launch it on a bunch of different OSes?

hth,
Michele
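
P.S. A rough sketch of what such a test program could start from, in case it
helps. It is only an illustration under stated assumptions, not Bacula code:
the names probe_device() and struct probe_result are made up here, and using
fcntl(F_SETFL) to clear O_NONBLOCK is my guess at the call that reportedly
misbehaves on some systems. It opens the device read-only with O_NONBLOCK,
records how long open() took, and then tries to switch the descriptor back to
blocking mode, keeping the errno of each step.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical result record for this sketch; not taken from Bacula. */
struct probe_result {
    int open_errno;      /* 0 if open() succeeded, otherwise its errno */
    int clear_errno;     /* 0 if O_NONBLOCK was cleared, otherwise errno */
    double open_seconds; /* wall-clock time spent inside open() */
};

/*
 * Open `path` read-only with O_NONBLOCK, time the open(), then try to
 * switch the descriptor back to blocking mode with fcntl(F_SETFL).
 * Returns 0 if the open succeeded, -1 if it failed.
 */
int probe_device(const char *path, struct probe_result *res)
{
    struct timespec t0, t1;
    int fd, saved_errno, flags;

    assert(path != NULL && res != NULL);
    memset(res, 0, sizeof(*res));

    clock_gettime(CLOCK_MONOTONIC, &t0);
    fd = open(path, O_RDONLY | O_NONBLOCK);
    saved_errno = errno;            /* capture before any other call */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    res->open_seconds = (double)(t1.tv_sec - t0.tv_sec)
                      + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;

    if (fd < 0) {
        res->open_errno = saved_errno;
        return -1;
    }

    /* Assumption: clearing the flag via F_SETFL is the step to test. */
    flags = fcntl(fd, F_GETFL);
    if (flags < 0 || fcntl(fd, F_SETFL, flags & ~O_NONBLOCK) < 0)
        res->clear_errno = errno;

    close(fd);
    return 0;
}
```

A small main() could call probe_device() on the tape device with the drive
empty (for example /dev/nst0 on Linux; the path differs per OS) and print the
three fields, so results from different systems can be compared on the list.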