On Sunday 20 November 2005 10:24, Cameron Murphy wrote: > Hey guys, I think I found the bug... it looks like in autochanger.c at > line 130 there is a test to see if the loaded slot is what is wanted. > If not, it locks the drive, and loads the proper one. The problem is > the unlock doesn't come until line 177, which is after the conditional > block. If you move "unlock_changer(dcr)" to line 165, it won't be > unbalanced if the drive already had the right tape to begin with. > > Hope this helps,
Yes, this is definitely a bug, and your suggested solution is correct. Now if I just had an autochanger on a FreeBSD system, the autochanger regression script would have caught this, or maybe there is some way I can make Linux default to the strict checking done in FreeBSD. I'll post a patch for this shortly ... Thanks, Kern > > Cameron > > Kern Sibbald wrote: > >Hello, > > > >This looks like a bug to me. You are getting a mutex (lock) failure > > probably because a mutex is being cleared twice. Normally this doesn't > > happen and FreeBSD passed all my regression scripts. However, I see that > > in your case, Bacula did not recognize the tape as a Bacula labeled tape, > > so perhaps it took some error path where this problem occurs. > > > >I'd appreciate it if you would submit a bug report on this titled "SD > > crashes with mutex failure" and include everything in your email. > > > >Then, please make sure your binaries are not stripped by the installation > >proces so that the debugging symbols will be available, and run the > > Storage daemon under the debugger as described in the Kaboom chapter. > > When it dies, please get a traceback (as described in the manual) and add > > it to the bug report as a xxx.txt file. This should give me a better > > idea where the problem is coming from. > > > >In the mean time, I will see if I can duplicate this. If I can, I can fix > > it, if not, we may need to turn on mutex trace code in the source to > > figure out what is going wrong. > > > >On Saturday 19 November 2005 07:02, Josh Endries wrote: > >>Hello, > >> > >>I'm having trouble getting things working after upgrading to 1.38 on > >>my system. btape's normal and autochanger tests were all successful, > >>it's probably a FreeBSD issue but I don't know what. I have an > >>Exabyte VXA 1x10 2U autoloader. > >> > >>19-Nov 00:14 backup-dir: Start Backup JobId 162, > >>Job=leto_Daily_Backup.2005-11-19_00.14.40 > >>19-Nov 00:21 leto: DIR and FD clocks differ by 398 seconds, FD > >>automatically adjusting. > >>19-Nov 00:14 backup: 3301 Issuing autochanger "loaded drive 0" command. > >>19-Nov 00:14 backup: 3302 Autochanger "loaded drive 0", result is Slot 1. > >>* > >>\19-Nov 00:15 backup: 3307 Issuing autochanger "unload slot 1, drive > >>0" command. > >>19-Nov 00:15 backup: 3304 Issuing autochanger "load slot 6, drive 0" > >>command. > >>*\ > >>19-Nov 00:16 backup: 3305 Autochanger "load slot 6, drive 0", status > >>is OK. > >>19-Nov 00:16 backup: 3301 Issuing autochanger "loaded drive 0" command. > >>19-Nov 00:16 backup: 3302 Autochanger "loaded drive 0", result is Slot 6. > >>* > >>19-Nov 00:18 backup: 3301 Issuing autochanger "loaded drive 0" command. > >>19-Nov 00:18 backup: 3302 Autochanger "loaded drive 0", result is Slot 6. > >>19-Nov 00:25 leto: leto_Daily_Backup.2005-11-19_00.14.40 Fatal error: > >>job.c:1594 Comm error with SD. bad response to Append Data. > >>ERR=Broken pipe > >>19-Nov 00:18 backup-dir: leto_Daily_Backup.2005-11-19_00.14.40 Error: > >>Bacula 1.38.1 (14Nov05): 19-Nov-2005 00:18:47 > >> JobId: 162 > >> Job: leto_Daily_Backup.2005-11-19_00.14.40 > >> Backup Level: Differential, since=2005-09-17 01:00:04 > >> Client: "leto" > >>i386-portbld-freebsd5.4,freebsd,5.4-STABLE > >> FileSet: "leto" 2005-09-13 20:26:12 > >> Pool: "Weekly" > >> Storage: "Exabyte VXA AutoPak 1x10" > >> Scheduled time: 19-Nov-2005 00:14:36 > >> Start time: 19-Nov-2005 00:14:42 > >> End time: 19-Nov-2005 00:18:47 > >> Priority: 1 > >> FD Files Written: 0 > >> SD Files Written: 0 > >> FD Bytes Written: 0 > >> SD Bytes Written: 0 > >> Rate: 0.0 KB/s > >> Software Compression: None > >> Volume name(s): > >> Volume Session Id: 1 > >> Volume Session Time: 1132376988 > >> Last Volume Bytes: 241,210,545 > >> Non-fatal FD errors: 0 > >> SD Errors: 0 > >> FD termination status: Error > >> SD termination status: Error > >> Termination: *** Backup Error *** > >> > >>19-Nov 00:18 backup-dir: message.c:454 Mail prog: bsmtp: bsmtp.c:277 > >>Fatal connect error to localhost: ERR=Connection refused > >>19-Nov 00:18 backup-dir: leto_Daily_Backup.2005-11-19_00.14.40 Error: > >>message.c:465 Mail program terminated in error. > >>CMD=/usr/local/sbin/bsmtp -h localhost -f "(Bacula) [EMAIL PROTECTED]" > >>-s "Bacula: Backup Fatal Error of leto Differential" [EMAIL PROTECTED] > >>ERR=Child exited with code 1 > >> > >>The SD process actually dies during this process (at 00:25). This is > >>SD'd output: > >> > >>backup: cram-md5.c:52 send: auth cram-md5 > >><[EMAIL PROTECTED]> ssl=0 > >>backup: cram-md5.c:68 Authenticate OK 7jUWknxuD5xCh/seH//vIA > >>backup: cram-md5.c:114 sending resp to challenge: r+/jcih9CE+XEzwlz9+fzA > >>backup: dircmd.c:187 Message channel init completed. > >>backup: cram-md5.c:52 send: auth cram-md5 > >><[EMAIL PROTECTED]> ssl=0 > >>backup: cram-md5.c:68 Authenticate OK X4+uVxsZG4+sd3+vmzU3ZC > >>backup: cram-md5.c:114 sending resp to challenge: 5R+GJxtYK6+fLw+Rnm/47D > >>backup: autochanger.c:217 run_prog: > >>/usr/local/share/bacula/my-changer /dev/pass0 loaded 6 /dev/nsa0 0 > >>stat=0 result=1 > >> > >>backup: device.c:326 Force close_dev "Exabyte VXA AutoPak 1x10 drive > >>1" (/dev/nsa0) > >>backup: dev.c:1702 really close_dev "Exabyte VXA AutoPak 1x10 drive > >>1" (/dev/nsa0) > >>backup: dvd.c:93 Enter unmount_dev > >>mt: /dev/nsa0: Device not configured > >>Unloading Data Transfer Element into Storage Element 1...done > >>backup: dev.c:277 open dev: tape=2 dev_name="Exabyte VXA AutoPak 1x10 > >>drive 1" (/dev/nsa0) vol=weekly-1-1 mode=OPEN_READ_WRITE > >>backup: dev.c:335 open dev: device is tape > >>backup: autochanger.c:217 run_prog: > >>/usr/local/share/bacula/my-changer /dev/pass0 loaded 6 /dev/nsa0 0 > >>stat=0 result=6 > >> > >>backup: dev.c:410 open dev: tape 6 opened > >>backup: dev.c:655 rewind_dev fd=6 "Exabyte VXA AutoPak 1x10 drive 1" > >>(/dev/nsa0) > >>backup: label.c:138 Big if statement in read_volume_label > >>backup: label.c:150 Could not unserialize Volume label: > >>ERR=label.c:720 Expecting Volume Label, got FI=0 Stream=0 len=64412 > >> > >>backup: dev.c:655 rewind_dev fd=6 "Exabyte VXA AutoPak 1x10 drive 1" > >>(/dev/nsa0) > >>backup: autochanger.c:217 run_prog: > >>/usr/local/share/bacula/my-changer /dev/pass0 loaded 6 /dev/nsa0 0 > >>stat=0 result=6 > >> > >>19-Nov 00:18 backup: ABORTING due to ERROR in bsys.c:382 > >>Mutex unlock failure. ERR=Operation not permitted > >>Kaboom! bacula-sd, backup got signal 11. Attempting traceback. > >>Kaboom! exepath=/usr/local/sbin/ > >>Calling: /usr/local/sbin/btraceback /usr/local/sbin/bacula-sd 30645 > >>execv: /usr/local/sbin/btraceback failed: ERR=Permission denied > >> > >>I've checked the file permissions on the tape device files and Bacula > >>should have RW access (and btape worked). My changer script > >>(mtx-changer with offline enabled) works...the "mutex unlock failure" > >>and "device not configured" make me think it's a problem with > >>FreeBSD's tape stuff, but I dunno what it could be. Maybe a hold over > >>from when I upgraded, though I thought I "upgraded" the config files. > >>Using chio doesn't change anything, I get the same errors. Anyone out > >>there know what this might be? The only thing I've found online about > >>these errors has to do with the drive being empty when trying to > >>control it, but there's no difference if I start the process with a > >>tape in the drive or not... > >> > >>bacula-sd.conf: > >>storage { > >> Name = "backup" > >> WorkingDirectory = "/var/db/bacula" > >> Pid Directory = "/var/run" > >> Maximum Concurrent Jobs = 30 > >>} > >>autochanger { > >> name = "Exabyte VXA AutoPak 1x10" > >> changer device = /dev/pass0 > >> changer command = "/usr/local/share/bacula/my-changer %c %o > >>%S %a %d" > >> device = "Exabyte VXA AutoPak 1x10 drive 1" > >>} > >>device { > >> name = "Exabyte VXA AutoPak 1x10 drive 1" > >> archive device = /dev/nsa0 > >> media type = "VXA-1" > >> autochanger = yes > >> automatic mount = yes > >> always open = yes > >> removable media = yes > >> random access = no > >> spool directory = "/usr/bacula" > >> maximum spool size = 10gb > >> label media = no > >> hardware end of file = no > >> hardware end of medium = no > >> offline on unmount = yes > >> bsf at eom = yes > >> two eof = yes > >> backward space record = no > >> backward space file = no > >> fast forward space file = no; > >>} > >> > >>Thanks, > >>Josh > >> > >> > >>------------------------------------------------------- > >>This SF.Net email is sponsored by the JBoss Inc. Get Certified Today > >>Register for a JBoss Training Course. Free Certification Exam > >>for All Training Attendees Through End of 2005. For more info visit: > >>http://ads.osdn.com/?ad_id=7628&alloc_id=16845&op=click > >>_______________________________________________ > >>Bacula-users mailing list > >>Bacula-users@lists.sourceforge.net > >>https://lists.sourceforge.net/lists/listinfo/bacula-users ------------------------------------------------------- This SF.Net email is sponsored by the JBoss Inc. Get Certified Today Register for a JBoss Training Course. Free Certification Exam for All Training Attendees Through End of 2005. For more info visit: http://ads.osdn.com/?ad_id=7628&alloc_id=16845&op=click _______________________________________________ Bacula-users mailing list Bacula-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bacula-users