Hey guys, I think I found the bug... it looks like in autochanger.c at
line 130 there is a test to see if the loaded slot is what is wanted.
If not, it locks the drive, and loads the proper one. The problem is
the unlock doesn't come until line 177, which is after the conditional
block. If you move "unlock_changer(dcr)" to line 165, it won't be
unbalanced if the drive already had the right tape to begin with.
Hope this helps,
Cameron
Kern Sibbald wrote:
Hello,
This looks like a bug to me. You are getting a mutex (lock) failure probably
because a mutex is being cleared twice. Normally this doesn't happen and
FreeBSD passed all my regression scripts. However, I see that in your case,
Bacula did not recognize the tape as a Bacula labeled tape, so perhaps it
took some error path where this problem occurs.
I'd appreciate it if you would submit a bug report on this titled "SD crashes
with mutex failure" and include everything in your email.
Then, please make sure your binaries are not stripped by the installation
proces so that the debugging symbols will be available, and run the Storage
daemon under the debugger as described in the Kaboom chapter. When it dies,
please get a traceback (as described in the manual) and add it to the bug
report as a xxx.txt file. This should give me a better idea where the
problem is coming from.
In the mean time, I will see if I can duplicate this. If I can, I can fix it,
if not, we may need to turn on mutex trace code in the source to figure out
what is going wrong.
On Saturday 19 November 2005 07:02, Josh Endries wrote:
Hello,
I'm having trouble getting things working after upgrading to 1.38 on
my system. btape's normal and autochanger tests were all successful,
it's probably a FreeBSD issue but I don't know what. I have an
Exabyte VXA 1x10 2U autoloader.
19-Nov 00:14 backup-dir: Start Backup JobId 162,
Job=leto_Daily_Backup.2005-11-19_00.14.40
19-Nov 00:21 leto: DIR and FD clocks differ by 398 seconds, FD
automatically adjusting.
19-Nov 00:14 backup: 3301 Issuing autochanger "loaded drive 0" command.
19-Nov 00:14 backup: 3302 Autochanger "loaded drive 0", result is Slot 1.
*
\19-Nov 00:15 backup: 3307 Issuing autochanger "unload slot 1, drive
0" command.
19-Nov 00:15 backup: 3304 Issuing autochanger "load slot 6, drive 0"
command.
*\
19-Nov 00:16 backup: 3305 Autochanger "load slot 6, drive 0", status
is OK.
19-Nov 00:16 backup: 3301 Issuing autochanger "loaded drive 0" command.
19-Nov 00:16 backup: 3302 Autochanger "loaded drive 0", result is Slot 6.
*
19-Nov 00:18 backup: 3301 Issuing autochanger "loaded drive 0" command.
19-Nov 00:18 backup: 3302 Autochanger "loaded drive 0", result is Slot 6.
19-Nov 00:25 leto: leto_Daily_Backup.2005-11-19_00.14.40 Fatal error:
job.c:1594 Comm error with SD. bad response to Append Data.
ERR=Broken pipe
19-Nov 00:18 backup-dir: leto_Daily_Backup.2005-11-19_00.14.40 Error:
Bacula 1.38.1 (14Nov05): 19-Nov-2005 00:18:47
JobId: 162
Job: leto_Daily_Backup.2005-11-19_00.14.40
Backup Level: Differential, since=2005-09-17 01:00:04
Client: "leto"
i386-portbld-freebsd5.4,freebsd,5.4-STABLE
FileSet: "leto" 2005-09-13 20:26:12
Pool: "Weekly"
Storage: "Exabyte VXA AutoPak 1x10"
Scheduled time: 19-Nov-2005 00:14:36
Start time: 19-Nov-2005 00:14:42
End time: 19-Nov-2005 00:18:47
Priority: 1
FD Files Written: 0
SD Files Written: 0
FD Bytes Written: 0
SD Bytes Written: 0
Rate: 0.0 KB/s
Software Compression: None
Volume name(s):
Volume Session Id: 1
Volume Session Time: 1132376988
Last Volume Bytes: 241,210,545
Non-fatal FD errors: 0
SD Errors: 0
FD termination status: Error
SD termination status: Error
Termination: *** Backup Error ***
19-Nov 00:18 backup-dir: message.c:454 Mail prog: bsmtp: bsmtp.c:277
Fatal connect error to localhost: ERR=Connection refused
19-Nov 00:18 backup-dir: leto_Daily_Backup.2005-11-19_00.14.40 Error:
message.c:465 Mail program terminated in error.
CMD=/usr/local/sbin/bsmtp -h localhost -f "(Bacula) [EMAIL PROTECTED]"
-s "Bacula: Backup Fatal Error of leto Differential" [EMAIL PROTECTED]
ERR=Child exited with code 1
The SD process actually dies during this process (at 00:25). This is
SD'd output:
backup: cram-md5.c:52 send: auth cram-md5
<[EMAIL PROTECTED]> ssl=0
backup: cram-md5.c:68 Authenticate OK 7jUWknxuD5xCh/seH//vIA
backup: cram-md5.c:114 sending resp to challenge: r+/jcih9CE+XEzwlz9+fzA
backup: dircmd.c:187 Message channel init completed.
backup: cram-md5.c:52 send: auth cram-md5
<[EMAIL PROTECTED]> ssl=0
backup: cram-md5.c:68 Authenticate OK X4+uVxsZG4+sd3+vmzU3ZC
backup: cram-md5.c:114 sending resp to challenge: 5R+GJxtYK6+fLw+Rnm/47D
backup: autochanger.c:217 run_prog:
/usr/local/share/bacula/my-changer /dev/pass0 loaded 6 /dev/nsa0 0
stat=0 result=1
backup: device.c:326 Force close_dev "Exabyte VXA AutoPak 1x10 drive
1" (/dev/nsa0)
backup: dev.c:1702 really close_dev "Exabyte VXA AutoPak 1x10 drive
1" (/dev/nsa0)
backup: dvd.c:93 Enter unmount_dev
mt: /dev/nsa0: Device not configured
Unloading Data Transfer Element into Storage Element 1...done
backup: dev.c:277 open dev: tape=2 dev_name="Exabyte VXA AutoPak 1x10
drive 1" (/dev/nsa0) vol=weekly-1-1 mode=OPEN_READ_WRITE
backup: dev.c:335 open dev: device is tape
backup: autochanger.c:217 run_prog:
/usr/local/share/bacula/my-changer /dev/pass0 loaded 6 /dev/nsa0 0
stat=0 result=6
backup: dev.c:410 open dev: tape 6 opened
backup: dev.c:655 rewind_dev fd=6 "Exabyte VXA AutoPak 1x10 drive 1"
(/dev/nsa0)
backup: label.c:138 Big if statement in read_volume_label
backup: label.c:150 Could not unserialize Volume label:
ERR=label.c:720 Expecting Volume Label, got FI=0 Stream=0 len=64412
backup: dev.c:655 rewind_dev fd=6 "Exabyte VXA AutoPak 1x10 drive 1"
(/dev/nsa0)
backup: autochanger.c:217 run_prog:
/usr/local/share/bacula/my-changer /dev/pass0 loaded 6 /dev/nsa0 0
stat=0 result=6
19-Nov 00:18 backup: ABORTING due to ERROR in bsys.c:382
Mutex unlock failure. ERR=Operation not permitted
Kaboom! bacula-sd, backup got signal 11. Attempting traceback.
Kaboom! exepath=/usr/local/sbin/
Calling: /usr/local/sbin/btraceback /usr/local/sbin/bacula-sd 30645
execv: /usr/local/sbin/btraceback failed: ERR=Permission denied
I've checked the file permissions on the tape device files and Bacula
should have RW access (and btape worked). My changer script
(mtx-changer with offline enabled) works...the "mutex unlock failure"
and "device not configured" make me think it's a problem with
FreeBSD's tape stuff, but I dunno what it could be. Maybe a hold over
from when I upgraded, though I thought I "upgraded" the config files.
Using chio doesn't change anything, I get the same errors. Anyone out
there know what this might be? The only thing I've found online about
these errors has to do with the drive being empty when trying to
control it, but there's no difference if I start the process with a
tape in the drive or not...
bacula-sd.conf:
storage {
Name = "backup"
WorkingDirectory = "/var/db/bacula"
Pid Directory = "/var/run"
Maximum Concurrent Jobs = 30
}
autochanger {
name = "Exabyte VXA AutoPak 1x10"
changer device = /dev/pass0
changer command = "/usr/local/share/bacula/my-changer %c %o
%S %a %d"
device = "Exabyte VXA AutoPak 1x10 drive 1"
}
device {
name = "Exabyte VXA AutoPak 1x10 drive 1"
archive device = /dev/nsa0
media type = "VXA-1"
autochanger = yes
automatic mount = yes
always open = yes
removable media = yes
random access = no
spool directory = "/usr/bacula"
maximum spool size = 10gb
label media = no
hardware end of file = no
hardware end of medium = no
offline on unmount = yes
bsf at eom = yes
two eof = yes
backward space record = no
backward space file = no
fast forward space file = no;
}
Thanks,
Josh
-------------------------------------------------------
This SF.Net email is sponsored by the JBoss Inc. Get Certified Today
Register for a JBoss Training Course. Free Certification Exam
for All Training Attendees Through End of 2005. For more info visit:
http://ads.osdn.com/?ad_id=7628&alloc_id=16845&op=click
_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
-------------------------------------------------------
This SF.Net email is sponsored by the JBoss Inc. Get Certified Today
Register for a JBoss Training Course. Free Certification Exam
for All Training Attendees Through End of 2005. For more info visit:
http://ads.osdn.com/?ad_id=7628&alloc_id=16845&op=click
_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users