Good morning,

we have Bacula 1.38 running on some Debian GNU/Linux 4.0 servers. We use sqlite3 for the Bacula catalog. The Director (dir) and the Storage Daemon (sd) are on the same server. Until recently, everything was running perfectly. Suddenly one of the backups started failing with messages like this:
05-Nov 01:15 dir: Start Backup JobId 3275, Job=nvpop01.2007-11-05_01.03.11
05-Nov 01:15 sd: Spooling data ...
05-Nov 01:20 sd: User specified spool size reached.
05-Nov 01:20 sd: Writing spooled data to Volume. Despooling 2,000,050,353 bytes ...
05-Nov 01:21 sd: Spooling data again ...
05-Nov 01:25 sd: User specified spool size reached.
05-Nov 01:25 sd: Writing spooled data to Volume. Despooling 2,000,050,362 bytes ...
05-Nov 01:26 sd: Spooling data again ...
05-Nov 01:30 sd: User specified spool size reached.
05-Nov 01:30 sd: Writing spooled data to Volume. Despooling 2,000,050,334 bytes ...
05-Nov 01:31 sd: Spooling data again ...
05-Nov 01:33 sd: Committing spooled data to Volume "UTw0001". Despooling 1,408,516,846 bytes ...
05-Nov 01:34 sd: Sending spooled attrs to the Director. Despooling 17,901,905 bytes ...
05-Nov 03:15 dir: nvpop01.2007-11-05_01.03.11 Fatal error: Network error with FD during Backup: ERR=Connection reset by peer
05-Nov 03:15 dir: nvpop01.2007-11-05_01.03.11 Fatal error: No Job status returned from FD.
05-Nov 03:15 dir: nvpop01.2007-11-05_01.03.11 Error: Bacula 1.38.11 (28Jun06): 05-Nov-2007 03:15:44
  JobId:                  3275
  Job:                    nvpop01.2007-11-05_01.03.11
  Backup Level:           Full
  Client:                 "nvpop01-fd" i486-pc-linux-gnu,debian,4.0
  FileSet:                "nvpop01FS" 2007-06-01 16:54:46
  Pool:                   "UTweek"
  Storage:                "sd"
  Scheduled time:         05-Nov-2007 01:03:10
  Start time:             05-Nov-2007 01:15:44
  End time:               05-Nov-2007 03:15:44
  Elapsed time:           2 hours
  Priority:               10
  FD Files Written:       0
  SD Files Written:       49,024
  FD Bytes Written:       0 (0 B)
  SD Bytes Written:       7,399,999,725 (7.399 GB)
  Rate:                   0.0 KB/s
  Software Compression:   None
  Volume name(s):         UTw0001
  Volume Session Id:      6
  Volume Session Time:    1194198162
  Last Volume Bytes:      190,824,602,912 (190.8 GB)
  Non-fatal FD errors:    0
  SD Errors:              0
  FD termination status:  Error
  SD termination status:  OK
  Termination:            *** Backup Error ***

The first thing that I noticed is that despooling attributes takes ages (more than data backup).
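To put rough numbers on that, here is a back-of-the-envelope calculation using only the timestamps and byte counts from the log above. It is a sketch under two assumptions: the log only has one-minute resolution, and the attribute phase is taken to run from "Sending spooled attrs" (01:34) until the job was declared failed (03:15), which may overstate its real duration. The helper name `rate_bytes_per_sec` is made up for this illustration.

```python
from datetime import datetime


def rate_bytes_per_sec(nbytes, start, end, fmt="%H:%M"):
    """Average throughput between two HH:MM log timestamps on the same day."""
    elapsed = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return nbytes / elapsed.total_seconds()


# Data phase: first "Spooling data" (01:15) up to "Sending spooled attrs" (01:34).
# Byte count is "SD Bytes Written" from the job report.
data_rate = rate_bytes_per_sec(7_399_999_725, "01:15", "01:34")

# Attribute phase (assumption): 01:34 until the fatal error at 03:15.
# Byte count is the "Despooling 17,901,905 bytes" attribute line.
attr_rate = rate_bytes_per_sec(17_901_905, "01:34", "03:15")

print(f"data:  {data_rate / 1e6:.2f} MB/s")
print(f"attrs: {attr_rate / 1e3:.2f} KB/s")
print(f"ratio: {data_rate / attr_rate:.0f}x")
```

On these figures the file data moves at several MB/s while the attributes go into the catalog at only a few KB/s, so the attribute phase, not the data phase, dominates the job's run time.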
To understand what's going on, I created a fake directory tree with 50K empty directories. With this setup I have very little data to store but about 8 MB of attributes to save (about half of what the real backup that's troubling us produces). With it I can reproduce both the long attribute despooling time and the error.

I tried to add Heartbeat Interval, but the Director and Storage Daemon config files don't seem to accept this option (I have a Bacula 2.0 manual, which states that I can put that option almost everywhere). The File Daemon did accept it, but it didn't make any difference: with the heartbeat interval set to 60 seconds, the backup still fails.

I can see that the list of files is being sent to the catalog (they show up if I list them with list files jobid=nnnn), but according to the mail report the backup failed.

All other backups are running fine, but none of them has the same amount of attribute data. The same backup job also runs fine if I set the level to Incremental. The incremental produces 1.3 MB of attributes, and it takes 9 minutes to despool them, so I know that after 9 minutes the FD is still there.

I think the problem might be that despooling attributes takes so long that the FD closes the connection before the Director comes back to ask for the job status, but I don't know how to keep the FD waiting.

Has anybody experienced this problem? How did you fix it?

Thank you very much
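For anyone who wants to reproduce the attribute-heavy test case, a tree of empty directories like the one described above could be generated along these lines. This is only a sketch, not the exact procedure used for the test: the function name `make_empty_tree`, the bucket layout and the `fanout` value are all assumptions chosen for illustration.

```python
import os


def make_empty_tree(root, count=50_000, fanout=1_000):
    """Create `count` empty leaf directories under `root`.

    Leaves are grouped into bucket directories of `fanout` entries each so
    that no single directory ends up with tens of thousands of children.
    Returns the number of leaf directories requested.
    """
    created = 0
    for i in range(count):
        bucket = os.path.join(root, f"b{i // fanout:04d}")
        # makedirs also creates the bucket directory on first use.
        os.makedirs(os.path.join(bucket, f"d{i:06d}"), exist_ok=True)
        created += 1
    return created
```

Calling `make_empty_tree("/some/test/path")` (the path is a placeholder) and pointing a FileSet at that directory should give a job with almost no file data but one attribute record per directory. Note that the bucket directories themselves add a few extra entries on top of `count`.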