Hi, I'm running Bacula 5.0.2 in conjunction with PostgreSQL 8.4.3 (both compiled from the "official" sources) on a Debian Lenny system. The system runs the director, storage daemon and file daemon. The actual storage media are barcode-labeled LTO3 tapes in an HP Storage Works 1/8 G2 autoloader equipped with an HP Ultrium 920 drive.
I've already followed the instructions in http://tldp.org/HOWTO/html_single/TCP-Keepalive-HOWTO/ and the corresponding FAQ entry http://wiki.bacula.org/doku.php?id=faq#my_backup_starts_but_dies_after_a_while_with_connection_reset_by_peer_error since I came across this thread, which had been started before I subscribed to the Bacula users mailing list: http://adsm.org/lists/html/Bacula-users/2010-04/msg00172.html

I decided to compile libkeepalive from SourceForge and use the LD_PRELOAD mechanism to make sure libkeepalive.so is always loaded. Furthermore, I changed the sysctl and Heartbeat Interval settings as described in the referenced FAQ entry.

Disabling accurate backups is not a solution for me, since I want deleted files to be taken into account, and I have to run quite a few long-running shell scripts that gather data via SCP from other hosts before the actual backup (writing to tape via the SD) starts. So a workaround for the TCP keepalive problem is absolutely necessary for me.

A status inquiry on the client from within bconsole works without a problem, but even with TCP keepalive enabled my backup stops after a very short period of time, as the log from the Bacula director shows:

===
1-Mai 16:33 nathan-sd JobId 13: Job write elapsed time = 00:00:36, Transfer rate = 20.30 M Bytes/second
11-Mai 16:34 nathan-fd JobId 13: Fatal error: backup.c:1019 Network send error to SD. ERR=Connection reset by peer
11-Mai 16:34 nathan-dir JobId 13: Error: Bacula nathan-dir 5.0.2 (28Apr10): 11-Mai-2010 16:34:27
  Build OS:               x86_64-unknown-linux-gnu debian 5.0.4
  JobId:                  13
  Job:                    nathan_backup.2010-05-11_16.32.43_03
  Backup Level:           Full (upgraded from Incremental)
  Client:                 "nathan" 5.0.2 (28Apr10) x86_64-unknown-linux-gnu,debian,5.0.4
  FileSet:                "nathan fileset" 2010-05-10 23:05:00
  Pool:                   "WeeklyBackups" (From Job FullPool override)
  Catalog:                "MyCatalog" (From Client resource)
  Storage:                "nathan-sd" (From Pool resource)
  Scheduled time:         11-Mai-2010 16:32:41
  Start time:             11-Mai-2010 16:32:45
  End time:               11-Mai-2010 16:34:27
  Elapsed time:           1 min 42 secs
  Priority:               10
  FD Files Written:       4,072
  SD Files Written:       4,059
  FD Bytes Written:       734,948,632 (734.9 MB)
  SD Bytes Written:       731,138,860 (731.1 MB)
  Rate:                   7205.4 KB/s
  Software Compression:   None
  VSS:                    no
  Encryption:             no
  Accurate:               yes
  Volume name(s):
  Volume Session Id:      1
  Volume Session Time:    1273588341
  Last Volume Bytes:      2,194,827,264 (2.194 GB)
  Non-fatal FD errors:    0
  SD Errors:              0
  FD termination status:  Error
  SD termination status:  Error
  Termination:            *** Backup Error ***
===

Any help will be greatly appreciated.

Thanks in advance & kind regards,
Holger
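
P.S. For anyone comparing settings: as far as I understand it, the point of the LD_PRELOAD library and the sysctl changes is that every TCP socket ends up with SO_KEEPALIVE switched on and with shorter keepalive timers than the kernel defaults. Below is a minimal Python sketch of that per-socket effect. It is not taken from Bacula or from the FAQ; the function name enable_keepalive and the values 60/30/5 are placeholders chosen only for illustration, and the three TCP_KEEP* options are Linux-specific, so the sketch skips any that the platform does not offer.

```python
import socket


def enable_keepalive(sock, idle=60, interval=30, count=5):
    """Enable TCP keepalive on one socket and tune its timers where possible.

    idle/interval/count are illustrative placeholder values, not the ones
    recommended by the Bacula FAQ.
    """
    # Ask the kernel to send keepalive probes on this connection when idle.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Per-socket overrides of the net.ipv4.tcp_keepalive_* sysctls.
    # These constants are not defined on every platform, so look them up
    # defensively instead of assuming they exist.
    tuning = (
        ("TCP_KEEPIDLE", idle),       # seconds of idle time before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between unanswered probes
        ("TCP_KEEPCNT", count),       # unanswered probes before the connection is dropped
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)

    # Non-zero means keepalive is enabled on this socket.
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print("keepalive enabled:", enable_keepalive(s) != 0)
    s.close()
```

Reading the option back with getsockopt, as the last line of the function does, is a simple way to confirm the setting actually took effect on a given socket.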