Hello,

On Monday 14 November 2005 00:47, Steve Ellis wrote:
> Kern Sibbald said:
> > Hello again,
> >
> > You didn't by any chance recently upgrade from a 2.4 kernel to a 2.6
> > kernel did you?  I am seeing all kinds of hangs and other funny
> > behavior in the Storage daemon due to the change in the behavior of
> > the open() call for tape drives from one kernel to another.
>
> Thanks for looking at this so quickly Kern-

Well, if something is fundamentally broken, I would like to fix it, and I'm
just now testing 1.38.1 for release.

> No, I am running a 2.6 kernel, but I have been running it for 18 months or
> so.  I'm running a vintage Fedora Core 2 release--too lazy (and afraid) to
> upgrade on this system that is critical to my home network.  There has not
> been a new Core2 kernel in quite some time--my last kernel upgrade was in
> March, which I'm positive I was running, at least by August (I know I
> rebooted about that time).
>
> I'm a networking software engineer, so although I have a lot of capability
> to maintain, fix and debug a lot of stuff here at home, I don't have much
> in the way of spare time--consequently, I tend to keep using things if
> they are still working.  I did want to switch to Bacula 1.38, LTO2 and
> Fedora Core4, but have so far only done the first upgrade (bacula).  I saw
> messages on bacula-users about recent 2.6 changes, and was hoping that any
> dust would have settled by the time I got there (presumably when I get
> around to FC4--or FC5, if I continue to put it off any longer).
>
> If it would help, I can turn on some sd logging, or something.  The poll
> interval suggestion will probably work for me for now, especially once I
> get the LTO2 drive online, making nearly all of my backups a 1 tape
> affair.

After looking into this a bit here (I still have more testing to do), I am
more and more convinced that your problem is due to the kernel change.
Basically, what I see is that if there is no tape in the drive, the open()
call blocks, either in the OS or in Bacula code; it then fails at some
point, and your job is terminated.  The old behavior of the OS was to always
permit open() on the drive regardless of whether or not there was a tape in
it.  I don't know when the change occurred, i.e. in which version of the
kernel.  Given the new kernel development model, it is very likely that it
came during one of the various 2.6.x releases.
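
To make the open() behavior described above a bit more concrete, here is a
rough sketch in Python.  It is not Bacula's code, only an illustration of a
caller that keeps retrying a failing open() until a deadline passes instead
of giving up on the first error.  The function name open_tape, its
parameters, and the device path /dev/nst0 are all made up for the example,
and how a particular tape driver treats O_NONBLOCK when no tape is loaded is
an assumption that would need checking against the driver documentation.

```python
import os
import time


def open_tape(path, max_wait=300.0, poll=5.0,
              opener=os.open, sleep=time.sleep, clock=time.monotonic):
    """Try to open a tape device, retrying until max_wait seconds elapse.

    Returns a file descriptor on success; re-raises the last OSError once
    the deadline has passed.  opener, sleep and clock are injectable so
    the retry logic can be exercised without a real drive.
    """
    deadline = clock() + max_wait
    while True:
        try:
            # O_NONBLOCK asks for open() to return promptly rather than
            # wait for the drive to become ready (assumption: the driver
            # honors the flag this way).
            return opener(path, os.O_RDONLY | os.O_NONBLOCK)
        except OSError:
            if clock() >= deadline:
                raise           # waited long enough, report the failure
            sleep(poll)         # drive not ready, wait and try again


if __name__ == "__main__":
    # Hypothetical device path; adjust for the local system.
    try:
        fd = open_tape("/dev/nst0", max_wait=30.0, poll=5.0)
        print("opened, fd =", fd)
        os.close(fd)
    except OSError as exc:
        print("could not open drive:", exc)
```

The point of the sketch is only that a caller which expects open() to
succeed every time has to grow some kind of wait-and-retry loop once the OS
starts refusing to open an empty drive.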

There is a certain logic in what they have changed, but IMO, it is a
perverse way of dealing with the situation (no tape in the drive), and will
cause all kinds of problems.

If increasing the poll time works for you, OK, but after the tests I did
here, I don't really think it will work.  The real fix is going to take a
major redesign of Bacula, which currently expects to always be able to open
a drive, and when it cannot, it fails the job.

There are two workarounds for this situation that I see at the current time:

1. Remove the "Offline on Unmount" directive.  This will leave the old tape
   in the drive and allow Bacula to continue to open the drive.  However,
   you should probably set your poll time to 5 minutes so it doesn't wear
   the tape too much (I think that most modern tape drivers don't even
   re-read the tape; they simply cache the first block and keep returning
   it).

2. If you keep the "Offline on Unmount", you can probably prevent the
   failure by increasing the "Maximum Open Wait" to some large value.  This
   will cause Bacula to continue to try to open the drive even if it fails.
   I find this solution a bit less satisfactory than the one above.

I still have not run tests to see if the polling is broken in 1.38, which is
a possibility since the code that does the waiting was moved around and
enhanced.  My previous tests simulated your situation (no tape in the drive)
and never got very far because the OS prevented the drive from being opened,
and thus the polling code was never used.

-- 
Best regards,

Kern

  (">
  /\
  V_V

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users