Looks like the usual Firewall/Router dropping the connection after two hours, yes?

06-Jan 02:02 moregruel-dir: Start Backup JobId 4501, Job=Web-fury.2008-01-06_01.05.03
06-Jan 02:02 moregruel-dir: Recycled volume "Vol0033"
06-Jan 02:02 moregruel-sd: Recycled volume "Vol0033" on device "FileStorage" (/bacula), all previous data lost.
06-Jan 03:13 moregruel-dir: Volume used once. Marking Volume "Vol0033" as Used.
06-Jan 04:02 moregruel-dir: Web-fury.2008-01-06_01.05.03 Fatal error: Network error with FD during Backup: ERR=Connection reset by peer
06-Jan 04:02 moregruel-dir: Web-fury.2008-01-06_01.05.03 Fatal error: No Job status returned from FD.

But: HeartbeatInterval is set to 10 minutes in both the relevant FD and the SD. And, it looks like all the data was sent to the SD in ~1:10, and then nothing for the next 50 minutes. The log also has this:

  Start time:        06-Jan-2008 02:02:52
  End time:          06-Jan-2008 04:02:52
  Elapsed time:      2 hours
  Priority:          10
  FD Files Written:  0
  SD Files Written:  5,705
  FD Bytes Written:  0 (0 B)
  SD Bytes Written:  620,714,877 (620.7 MB)

Other (smaller) jobs from the same client work. This job used to work. All parties are running 1.38.11 on Debian. The router (running dd-wrt) has a TCP timeout of 3600 seconds, but with the heartbeat, that shouldn't matter, and in any case doesn't match the observed times.

Any suggestions?

Regards,
Steve

--
Steve Greenland
    The irony is that Bill Gates claims to be making a stable operating
    system and Linus Torvalds claims to be trying to take over the
    world.       -- seen on the net