Hello folks,
my bacula-fd dies after a successful incremental or full backup.
When the "BackupCatalog" job runs, bacula-fd does *not* die.
My system: current Debian unstable, bacula & Co. 1.36.1-1 with the MySQL
backend.
I back up to DVD-RAM. The directories and files I back up all exist;
none of them are missing.
Running strace -p .. -f -s 1000 on bacula-fd produced the attached
bzip2-compressed output.
I suspected that something was wrong with bsmtp, so I did the
following:
[/etc/bacula/bacula-dir.conf]
Messages {
Name = Standard
# mailcommand = "/usr/lib/bacula/bsmtp -h localhost -f \"\(Bacula\) %r\" -s \"Bacula: %t %e of %c %l\" %r"
# operatorcommand = "/usr/lib/bacula/bsmtp -h localhost -f \"\(Bacula\) %r\" -s \"Bacula: Intervention needed for %j\" %r"
mailcommand = "/bin/true -h localhost -f \"\(Bacula\) %r\" -s \"Bacula: %t %e of %c %l\" %r"
operatorcommand = "/bin/true -h localhost -f \"\(Bacula\) %r\" -s \"Bacula: Intervention needed for %j\" %r"
mail = [EMAIL PROTECTED] = all, !skipped
operator = [EMAIL PROTECTED] = mount
console = all, !skipped, !saved
append = "/var/lib/bacula/log" = all, !skipped
}
That is, I replaced bsmtp with /bin/true. The strace output mentioned
above was made with this config.
Searching for the error shows me this (trimmed for readability):
[ps aux | grep bacula, only cmdlines]
bacula-console
/bin/sh /usr/sbin/btraceback /usr/sbin/bacula-fd 456
gdb -quiet -batch -x /usr/lib/bacula/btraceback.gdb /usr/sbin/bacula-fd 456
/usr/lib/bacula/bsmtp -h localhost -s Bacula traceback root
/usr/sbin/bacula-fd -c /etc/bacula/bacula-fd.conf
gdb -quiet -batch -x /usr/lib/bacula/btraceback.gdb /usr/sbi
Hmmm, sorry for the chaos. I tried it again with a different fileset.
Oops, here it works; I backed up /boot and /root/. So I looked at the
strace output again. There are many, many open() calls, so I searched
for the last one before the btraceback, i.e. around the time PID 32477
dies. /etc/localtime is opened; it links to
/usr/share/zoneinfo/Europe/Berlin, which exists. But then it crashes:
[bzcat /tmp/strace.bacula-fd2.bz2 | nl -ba | less]
37792 [pid 32477] write(1, "12-Mar 09:16 zeus-fd: ABORTING due to ERROR in smartall.c:181\nqp->qnext->qprev != qp called from match.c:74\n", 108) = -1 EBADF (Bad file descriptor)
37793 [pid 32477] --- SIGSEGV (Segmentation fault) @ 0 (0) ---
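Finding the last open() before the abort by paging through the trace is
tedious, so here is a minimal sketch of how the search could be scripted.
The function name last_open_before_abort is made up for illustration, and
the sketch assumes the decompressed trace uses the "[pid N]" prefixes that
strace -f writes, as in the excerpt above (for shorter PIDs strace may pad
the number with spaces, which this simple match would miss).

```shell
# last_open_before_abort TRACE_FILE PID   (hypothetical helper name)
# Prints the last open() line logged for PID before that PID writes
# the "ABORTING due to ERROR" message. Returns 1 if no abort is found.
last_open_before_abort() {
    trace_file=$1
    pid=$2
    awk -v pid="$pid" '
        # Skip lines that do not belong to the requested pid.
        index($0, "[pid " pid "]") == 0 { next }
        # Remember the most recent open() call seen so far.
        index($0, "] open(") > 0 { last = $0 }
        # At the abort message, report the remembered call and stop.
        index($0, "ABORTING due to ERROR") > 0 { print last; found = 1; exit }
        END { if (!found) exit 1 }
    ' "$trace_file"
}
```

It could be used like this, with /tmp/strace.txt being an arbitrary scratch
file:

    bzcat /tmp/strace.bacula-fd2.bz2 > /tmp/strace.txt
    last_open_before_abort /tmp/strace.txt 32477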
My normal fileset, with which bacula-fd crashes:
/boot
/etc
/home/tomtom/.mutt
/home/tomtom/.procmail
/home/tomtom/.fluxbox
/home/tomtom/.fetchmailrc
/home/tomtom/files
/home/tomtom/bin
/home/tomtom/public_html
/home/uml/etc
/home/uml/bin
/root/bin
/root/firewall
/usr/share/vdr
/var/spool/cron
With this fileset it does not crash; it keeps running:
/boot
/etc
/home/tomtom/.mutt
/home/tomtom/.procmail
/home/tomtom/.fluxbox
/home/tomtom/.fetchmailrc
/home/tomtom/files
/home/tomtom/bin
/home/tomtom/public_html
/home/uml/ is a link to /mnt/storage/uml/. /root/bin/ is empty. And
/var/spool/cron/crontabs looks like this:
drwx-wx--T 2 root crontab 4096 Oct 1 2001 /var/spool/cron/crontabs
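To narrow down which entry triggers the crash, the two filesets above can
be compared mechanically and each differing path inspected. This is only a
sketch: the function names fileset_diff and describe_path are made up, and
it assumes each fileset is saved as a plain file with one path per line. It
does not know anything about Bacula itself.

```shell
# fileset_diff CRASHING_LIST WORKING_LIST   (hypothetical helper name)
# Prints the entries that are in the first list but not in the second.
fileset_diff() {
    # -v invert match, -x whole lines only, -F fixed strings,
    # -f read the patterns (the working list) from a file
    grep -vxFf "$2" "$1"
}

# describe_path PATH   (hypothetical helper name)
# Reports whether PATH is a symlink, an empty directory, present, or missing.
describe_path() {
    if [ -L "$1" ]; then
        printf '%s: symlink -> %s\n' "$1" "$(readlink "$1")"
    elif [ -d "$1" ] && [ -z "$(ls -A "$1" 2>/dev/null)" ]; then
        printf '%s: empty or unreadable directory\n' "$1"
    elif [ -e "$1" ]; then
        printf '%s: exists\n' "$1"
    else
        printf '%s: missing\n' "$1"
    fi
}
```

For example, with the two lists saved as crashing.list and working.list
(names chosen here only for illustration):

    fileset_diff crashing.list working.list | while read -r p; do
        describe_path "$p"
    done

Backing up the differing entries one at a time would then show which of
them makes bacula-fd abort.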
The filesets are included in bacula-dir.conf with "<filset". I pasted the
list directly into that file and restarted bacula-dir; that backup takes
longer than the other ones. I have always backed up incrementally.
What's going on here? Please do not hesitate to ask further
questions. Sorry for the confusion; I am confused myself.
Greets,
Thomas