Hello,

We are backing up around 130 boxes using Bacula. We have one director box with 4 backup servers, and each backup server writes to File volumes on one large RAID array. Each night I get dozens of failed jobs. The errors are all similar to:
13-Aug 08:43 backup3-sd: unix3.2005-08-13_04.16.26 Fatal error: Device /backup/bacula is busy writing on another Volume.
13-Aug 08:59 unix3-fd: unix3.2005-08-13_04.16.26 Fatal error: job.c:1665 Bad response to Append Data command. Wanted 3000 OK data , got 3903 Error append data

Our concurrency settings are:
- Max jobs for each storage device set to 3 in the director conf
- Max jobs on the storage daemons set to 20 (FreeBSD ports default)
- Max total jobs on the director set to 12

I have included some snippets from various conf files that I thought were relevant. If any other info is needed I will be happy to supply it.

If one of these jobs fails, it seems that waiting a few minutes and then restarting it will get things going again. It also seems that restarting the storage daemon at the beginning of the night, before backups start, helps to minimize this.

Sometimes these failed jobs still appear to be running when I check the status of the storage daemons, even hours after they have failed, although there appears to be no actual transfer going on.

This is a mix of FreeBSD 4.11 and 5.4 servers. The director is on 5.4; all other servers are a mix. It doesn't seem to prefer failing on one version over another.

Also, is there an automated tool for deleting volumes from the storage server after they have been autopruned? I have a lot of old volumes taking up space and I would like to get rid of them. But there are hundreds, and I am lazy.
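To make the last request concrete, here is a rough sketch of the kind of cleanup meant. It is an illustration only, not an existing Bacula tool. It assumes that the text of the console's `list volumes` table has been saved or piped in, that the table has columns named VolumeName and VolStatus, and that each File volume is stored under the Archive Device directory in a file named after the volume. The function names are made up for this example.

```python
import os


def purged_volumes(table_text):
    """Return volume names whose VolStatus cell reads 'Purged'.

    Assumes a pipe-delimited table with a header row containing
    VolumeName and VolStatus (matched case-insensitively).
    """
    names = []
    name_idx = status_idx = None
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +-----+ borders, prompts and blank lines
        cells = [c.strip() for c in line.strip("|").split("|")]
        lowered = [c.lower() for c in cells]
        if "volumename" in lowered and "volstatus" in lowered:
            # Header row: remember where the two columns are.
            name_idx = lowered.index("volumename")
            status_idx = lowered.index("volstatus")
            continue
        if name_idx is None or len(cells) <= max(name_idx, status_idx):
            continue  # data row before any header, or a short row
        if cells[status_idx] == "Purged":
            names.append(cells[name_idx])
    return names


def existing_volume_files(names, directory):
    """Return paths of volume files that actually exist in directory."""
    paths = []
    for name in names:
        if os.path.basename(name) != name:
            continue  # refuse names that would escape the directory
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            paths.append(path)
    return paths


def clean(table_text, directory="/backup/bacula", delete=False):
    """List (and, only if delete=True, remove) files of purged volumes."""
    paths = existing_volume_files(purged_volumes(table_text), directory)
    for path in paths:
        if delete:
            os.remove(path)
        print(("deleted " if delete else "would delete ") + path)
    return paths
```

Calling `clean(text)` only prints what would be removed; `clean(text, delete=True)` removes the files. Removing a file this way does not touch the catalog, so the matching volume records would presumably still need to be dropped separately (for example with the console's `delete volume` command).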
=== Relevant SD conf
Storage {
  Name = backup2-sd
  SDPort = 9103
  WorkingDirectory = "/var/db/bacula"
  Pid Directory = "/var/run"
  Maximum Concurrent Jobs = 20
}

Device {
  Name = FileStorage
  Media Type = File
  Archive Device = /backup/bacula
  LabelMedia = yes;
  Random Access = Yes;
  AutomaticMount = yes;
  RemovableMedia = no;
  AlwaysOpen = yes;
}
=====

===== Highlights from director conf
Pool {
  Name = Default
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 8 days
  LabelFormat = "Vol"
  Maximum Volume Bytes = 4000000000
  Accept Any Volume = yes
}

Storage {
  Name = backup1-sd
  Address = backup1
  SDPort = 9103
  Password = XXXXXXXX
  Device = FileStorage
  Maximum Concurrent Jobs = 3
  Media Type = File
}
=====

James Ashton
Vortech Inc