Arno Lehmann wrote:
> Perhaps a hardware-related problem? Have you had a look into the
> system's log files?

Didn't find anything related in the logs.

> > Then the sd died:
>
> Now that's bad... even in case of a serious problem the SD shouldn't die.
>
> > 07-Okt 00:19 VU0EA003-sd: ABORTING due to ERROR in dev.c:724
> > dev.c:723 Bad call to rewind. Device "ULTRIUM-TD4-D3"
> > (/dev/ULTRIUM-TD4-D3) not open
> > Kaboom! bacula-sd, VU0EA003-sd got signal 11 - Segmentation violation.
> > Attempting traceback.
> > Kaboom! exepath=/usr/sbin/
> > Calling: /usr/sbin/btraceback /usr/sbin/bacula-sd 15802
> >
> > http://www.bacula.org/en/dev-manual/What_Do_When_Bacula.html
> >
> > gdb is installed but bacula-sd is not running as root, maybe that was
> > the reason why I got no traceback by mail.
>
> Possible... I believe gdb needs to run as root in some circumstances,
> but that's definitely not my field of expertise :-)

I've changed the btraceback file to be suid root. Not the best way, but
this is not a multi-user machine.

> > Anyway, I've seen this 'not ready, retrying...' problem only once
> > 5 months ago. There is nothing in the system logs or the changer
> > logfile when it happens.
> >
> > Any ideas what I have to do to prevent bacula from crashing at that point?
>
> No, but a suggestion.
>
> > I've changed the mtx-changer script to wait a bit longer:
> >
> > wait_for_drive() {
> >   i=0
> >   while [ $i -le 50 ]; do   # Wait max 1000 seconds
> >     if mt -f $1 status | grep "${ready}" >/dev/null 2>&1; then
> >       break
> >     fi
> >     debug "Device $1 - not ready, retrying..."
> >     sleep 1
> >     i=`expr $i + 20`
>
> That should be $i + 1 - now you're running the loop with 0, 20, 40, 60
> and the fourth iteration is already more than 50.
>
> So the retries shown in the log excerpt above would be because of
> Bacula's attempts to run the script, not inside the script.
>
> > done
> > }

Err, I think this is what I wanted:

wait_for_drive() {
  i=0
  while [ $i -le 1000 ]; do   # Wait max 1000 seconds
    if mt -f "$1" status | grep "${ready}" >/dev/null 2>&1; then
      break
    fi
    debug "Device $1 - not ready, retrying..."
    sleep 30
    i=`expr $i + 30`
  done
}

This increases the wait time to 1000s in 30s steps.

> > I've no idea what the drive was doing during the 15 minutes this night...
>
> I haven't, either. Just my observation above. Sorry.

Well, let's see if it happens again and if the longer wait time prevents
the sd from dying.

Ralf

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
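
As a quick sanity check on the two loop bounds discussed in this thread, the small sketch below counts how often a loop of the form `while [ $i -le LIMIT ]` with a fixed increment runs its body. `count_polls` is a hypothetical helper written only for this illustration; it is not part of mtx-changer and does not touch the drive.

```shell
#!/bin/sh
# count_polls LIMIT STEP
# Hypothetical helper: counts the iterations of
#   i=0; while [ $i -le LIMIT ]; do ...; i=`expr $i + STEP`; done
# without sleeping or calling mt.
count_polls() {
    limit=$1
    step=$2
    i=0
    n=0
    while [ "$i" -le "$limit" ]; do
        n=$((n + 1))        # one status poll per iteration
        i=$((i + step))     # same increment as in the script
    done
    echo "$n"
}

# First version: limit 50, step 20, sleep 1 per poll
echo "limit 50, step 20:   $(count_polls 50 20) polls"     # 3 polls, i = 0, 20, 40
# Revised version: limit 1000, step 30, sleep 30 per poll
echo "limit 1000, step 30: $(count_polls 1000 30) polls"   # 34 polls, i = 0 .. 990
```

With a 1-second sleep the first version gives up after roughly 3 seconds, while the revised version polls up to 34 times at 30-second intervals, so it waits about 1020 seconds in the worst case (slightly more than the 1000 seconds named in the comment, plus the time each `mt` call takes).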