Hey, all. We're seeing one of our Bacula machines acting up in a very strange manner. The error below occurs when one of our machines tries to back itself up. It's a Red Hat Enterprise Linux machine. Initially, it was running Bacula 1.36.1. I upgraded to 1.38.3 in the hope that this was a bug that had already been fixed, but we're seeing the same results under 1.38.3. I've also tried both full and incremental backups, and both behave the same.
The director, storage daemon, and file daemon are all on the same machine. There are no iptables rules whatsoever on the machine, aside from the default-to-accept rules. When I do a 'status client' from the console while the job is running, it's always stuck very early on:

Running Jobs:
JobId 440 Job server.2006-01-05_16.36.41 is running.
    Backup Job started: 05-Jan-06 16:36
    Files=497 Bytes=16,452 Bytes/sec=22
    Files Examined=505
    Processing file: /dev/cciss/c0d6
    SDReadSeqNo=5 fd=7
Director connected at: 05-Jan-06 16:48

Eventually, the job times out. The end of job summary looks like this:

05-Jan 16:54 server-fd: server.2006-01-05_16.36.41 Fatal error: backup.c:654 Network send error to SD. ERR=Connection timed out
05-Jan 16:55 server-dir: server.2006-01-05_16.36.41 Error: Bacula 1.38.3 (04Jan06): 05-Jan-2006 16:55:01
  JobId:                  440
  Job:                    server.2006-01-05_16.36.41
  Backup Level:           Full
  Client:                 "server-fd" i686-pc-linux-gnu,redhat,Enterprise release
  FileSet:                "Full Set" 2005-08-18 13:44:50
  Pool:                   "Default"
  Storage:                "File"
  Scheduled time:         05-Jan-2006 16:36:35
  Start time:             05-Jan-2006 16:36:43
  End time:               05-Jan-2006 16:55:01
  Priority:               10
  FD Files Written:       497
  SD Files Written:       0
  FD Bytes Written:       16,452
  SD Bytes Written:       0
  Rate:                   0.0 KB/s
  Software Compression:   None
  Volume name(s):
  Volume Session Id:      2
  Volume Session Time:    1136496459
  Last Volume Bytes:      85,443
  Non-fatal FD errors:    0
  SD Errors:              0
  FD termination status:  Error
  SD termination status:  Running
  Termination:            *** Backup Error ***

Another interesting point is that even after we get this error message, a 'status storage' shows that the SD thinks the job is still running:

Running Jobs:
Writing: Full Backup job server JobId=440 Volume="server_ide_0038"
    pool="Default" device=""FileStorage" (/backup)"
    Files=9 Bytes=631 Bytes/sec=0
    FDReadSeqNo=35 in_msg=25 out_msg=5 fd=7

Any insight on what could cause a problem like this (or suggestions on how to fix it =) ) would be greatly appreciated. Thanks!
-Brian

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users