On Friday 28 October 2005 12:00, Michele Baldessari wrote:
> Since I've spent some time looking at bacula's and mt's source code to
> figure out why just bacula was barfing on an empty tape drive on a
> 2.6.x linux kernel, I think it might be worth adding the following to
> the FAQ. (Apologies if something similar is in the docs already..I did
> look but found nothing relevant).
>
> Note: I stumbled upon this issue while configuring bacula 1.36.2 on a
> 2.6.x based system on an IBM 4560SLX tape library.
>
> thanks for all your work,
> Michele Baldessari
>
> * Bacula, kernel 2.6.x and SCSI hangs when tape drive is empty
>
> Bacula 1.36.x does not open the device with O_NONBLOCK and thus isn't
> able to cope with the kernel change in the st driver [1]. The solution
> is either to upgrade bacula to a 1.37.x version or to use a 2.4.x kernel.
>
> [1] http://www.ussg.iu.edu/hypermail/linux/kernel/0302.2/0066.html :
> """
> The open() behaviour of st was changed at 2.5.3 to conform with SUS
> (blocking) and what the other Unices do (timeout). If the device is opened
> without O_NONBLOCK, the driver waits for some time (default 2 minutes)
> for the device to become ready. If it does not become ready, an error is
> returned.

Do you know where the kernel documentation on this is, and where I can
find out more about SUS (whatever it means)?

My experience here is that if there is no tape in the drive, you must
open it in read-only mode with O_NONBLOCK set. In that case the open
succeeds, and you can then try to figure out that there is no media
present.

This is a real nightmare for me (and Bacula) because the use of
O_NONBLOCK seems to be very system dependent. For example, on FreeBSD
the ioctl() to clear O_NONBLOCK apparently is not valid on a tape drive.

As far as I can tell, this change has broken a number of functions such
as Polling in Bacula. Also, if you use "Offline on Unmount" and Bacula
wants another tape, Bacula will retry opening the drive for 5 minutes
(based on the idea that the open fails immediately); however, since the
open() now blocks for two minutes, Bacula will fail the job after 20
minutes.

I wonder if the guys who changed this in the kernel were aware of the
consequences.

If anyone knows if and where the kernel guys publish these things, I
sure would like to know. If I have to read through 50 million lines of
kernel change log to find this information, which I cannot do, we will
probably have more of these kinds of surprises in the future ... :-(

--
Best regards,

Kern

  (">
  /\
  V_V
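
For anyone following along, here is a rough C sketch of the sequence described above: open the drive read-only with O_NONBLOCK, ask the driver whether a tape is loaded, then switch back to blocking I/O. It is not Bacula's actual code. The function names are made up for illustration, the media check assumes the Linux st driver reports an empty drive through GMT_DR_OPEN in the MTIOCGET status, and clearing the flag with fcntl() is an assumption that may not hold on other systems (FreeBSD in particular, as noted above).

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/mtio.h>
#include <unistd.h>

/* Hypothetical helper: open the device read-only and non-blocking so that
 * open() returns at once instead of waiting for the drive to become ready. */
int tape_open_nonblock(const char *path)
{
    return open(path, O_RDONLY | O_NONBLOCK);
}

/* Hypothetical helper: returns 1 if the driver reports no tape loaded,
 * 0 if a tape appears to be present, -1 if the status ioctl fails
 * (for example because fd is not a tape device). Assumes Linux st. */
int tape_no_media(int fd)
{
    struct mtget status;

    if (ioctl(fd, MTIOCGET, &status) < 0)
        return -1;
    return GMT_DR_OPEN(status.mt_gstat) ? 1 : 0;
}

/* Hypothetical helper: return the descriptor to blocking mode once a tape
 * is known to be present. Returns 0 on success, -1 on failure. */
int tape_clear_nonblock(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags & ~O_NONBLOCK) < 0 ? -1 : 0;
}
```

A caller would open the device (for instance /dev/nst0, which is only an example path), poll tape_no_media() until it returns 0, and then call tape_clear_nonblock() before doing real I/O, or close and reopen the device in the mode it needs.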