-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Kern Sibbald wrote:
> On Sunday 29 July 2007 19:28, Ryan Novosielski wrote:
>> Hi all,
>>
>> Ever since I added the TapeAlert/smartmonctl command to my tape drive,
>> it appears as if I get a fairly regular crash of that bacula-sd. I know
>> there is a case where Bacula and the utility can go for the tape drive
>> at the same time and cause problems, but I don't think Bacula should go
>> KABOOM when this happens.
>
> The traceback, unfortunately, doesn't demangle the C++ subroutine names nor
> provide source line numbers, but the best that I can tell is that the heap
> has been corrupted, Bacula detects it, then does a Kaboom (self-inflicted seg
> fault).

I'm not too good with development tools -- is there a reason for this that
would be on my end (stripped binaries or something like that)?

> Are you by any chance pointing the tapealert/smartmonctl at the tape drive
> device rather than at the scsi control device? If you are, I am not
> surprised, and you should remove it as two different programs cannot properly
> exist using the same tape device.

Yes. And I actually will remove that, but as near as I can tell, Solaris does
not have a way of addressing the tape drive as two different devices. From
what I've read, the reason for this is that Solaris supposedly has the
ability to do two actions on the device at once -- I'm not sure where I read
that, though, so I can't confirm it.

I went looking for information about using the Solaris 'sgen' driver to
reach the control interface instead, but it appears that sgen is only used
to pick up devices that don't already have a type elsewhere; in other words,
I could stop using 'st' and start using 'sgen', but that really wouldn't get
me anywhere. Perhaps someone else will read about this and give me a pointer.

> If you are pointing it at the scsi control device, I would be interested to
> see what the normal output of the command gives back as there may be a
> possible buffer overrun though that really should not happen.
>
> In any case, I recommend that you remove the tape alert for a time and see if
> that eliminates the problem.

I suspect it will, as the problem only showed up when I added the alert, as
near as I can tell. A KABOOM seems like something that ought not to happen
either way, although I suppose that if something is corrupting buffers, it
can't be avoided. It is curious, though, as the tapealert often returns
"Device busy", which would seem to mean that there's no chance that the other
program using the device would actually hit an error.

>> This does not happen every day, but every once in a while...
>> it occurs at the end of a set of concurrent backups to tape -- all
>> incrementals, 7 in total. By the time my catalog backup runs 2 hours
>> later, the -sd has died and there is no connection made.
>>
>> The host machine is running Solaris 9, and the binaries are from
>> BlastWave (currently version 2.0.3 with 2.0.2 clients, but until the day
>> before yesterday, the admin/server machine was running 2.0.2 with
>> identical results). I have not tried 2.1.x, but I would not be allowed
>> to run a production schedule on a beta -- perhaps an exact copy on the
>> same machine but writing to disk might yield the same results, but I
>> suspect that this is caused by the TapeAlert, so maybe not.
>
> For a problem with tape alert, it is very unlikely that upgrading to 2.1.x
> will help.
>
>> Thanks for any insights you can provide -- I'd be happy to report a bug
>> if it is needed.
>
> Until I see your response and think about it, I don't think this is worth a
> bug report, at least not just yet.

OK, that is fine. If there's any easy way to try to get more information out
of this thing, let me know. I actually had a fair amount of trouble getting
this much in the first place -- if you run your bacula-dir as a non-root
user as one really should, it then cannot run proper traces against daemons
that run as root. I had to involve sudo; originally, I had no idea that this
even ran by itself until I saw a number of empty traceback e-mails in root's
box.

- -- 
 ---- _  _ _  _ ___  _  _  _
 |Y#| |  | |\/| |  \ |\ |  | |Ryan Novosielski - Systems Programmer II
 |$&| |__| |  | |__/ | \| _| |[EMAIL PROTECTED] - 973/972.0922 (2-0922)
 \__/ Univ. of Med. and Dent.|IST/AST - NJMS Medical Science Bldg - C630
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGrgPumb+gadEcsb4RAj8cAJ9dmkngZlOHQBByPaP7iTRC/Ex37QCg1rqM
cUnnOqLvp4fdi51FyPoMY2s=
=cDNA
-----END PGP SIGNATURE-----

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users