Hello,
 We are backing up around 130 boxes using Bacula. We have one director box with
4 backup servers, and each backup server writes File volumes to a single large
RAID array.
 Each night I get dozens of failed jobs. The errors are all similar to:

13-Aug 08:43 backup3-sd: unix3.2005-08-13_04.16.26 Fatal error: Device 
/backup/bacula is busy writing on another Volume.
13-Aug 08:59 unix3-fd: unix3.2005-08-13_04.16.26 Fatal error: job.c:1665 Bad 
response to Append Data command. Wanted 3000 OK data
, got 3903 Error append data 
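
To put a number on "dozens", I have been thinking of tallying the fatal errors
per daemon and job with something like the rough sketch below. The line format
is assumed from the two messages above (unwrapped onto one line each), and the
names FATAL_RE and tally_fatal_errors are just placeholders of mine, not
anything that ships with Bacula.

```python
import re
from collections import Counter

# Assumed message layout, taken from the two log lines quoted above:
#   <day>-<Mon> <HH:MM> <daemon>: <job name> Fatal error: <text>
FATAL_RE = re.compile(
    r"^(?P<date>\d{1,2}-[A-Za-z]{3}) (?P<time>\d{2}:\d{2}) "
    r"(?P<daemon>\S+): (?P<job>\S+) Fatal error: (?P<text>.*)$"
)


def tally_fatal_errors(lines):
    """Count fatal-error lines per (daemon, job) pair."""
    counts = Counter()
    for line in lines:
        match = FATAL_RE.match(line.strip())
        if match:
            counts[(match["daemon"], match["job"])] += 1
    return counts


# The two messages from this mail, joined back onto single lines.
sample = [
    "13-Aug 08:43 backup3-sd: unix3.2005-08-13_04.16.26 Fatal error: "
    "Device /backup/bacula is busy writing on another Volume.",
    "13-Aug 08:59 unix3-fd: unix3.2005-08-13_04.16.26 Fatal error: "
    "job.c:1665 Bad response to Append Data command. Wanted 3000 OK data",
]

if __name__ == "__main__":
    for (daemon, job), n in sorted(tally_fatal_errors(sample).items()):
        print(f"{daemon}  {job}  {n}")
```

Lines that do not match the assumed layout (continuation lines, non-fatal
messages) are simply skipped.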

We have max jobs for each storage device set to 3 in the director conf.
Max jobs on the storage daemons is set to 20 (the FreeBSD ports default).
Max total jobs on the director is set to 12.

I have included some snippets from various conf files that I thought were
relevant. If any other info is needed I will be happy to supply it.

If one of these jobs fails, it seems that waiting a few minutes and then
restarting it will get things going again.

It also seems that restarting the storage daemon at the beginning of the night,
before backups start, helps to minimize this.

Sometimes these failed jobs still appear to be running when I check the status
of the storage daemons, even hours after they have failed, although there
appears to be no actual transfer going on.

This is a mix of FreeBSD 4.11 and 5.4 servers. The director is 5.4; all other
servers are a mix. It doesn't seem to prefer failing on one version over
another.

Also, is there an automated tool for deleting volumes from the storage server
after they have been autopruned? I have a lot of old volumes taking up space
that I would like to get rid of, but there are hundreds, and I am lazy.
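
To make the request concrete, here is a rough sketch of the kind of cleanup I
have in mind. It is only an illustration under assumptions: the keep list
(volume names that must stay on disk) is a hypothetical input that would have
to come from the catalog somehow, the "Vol" prefix matches the LabelFormat in
the Pool shown further down, and find_stale_volumes / remove_volumes are names
I made up. It does not touch the catalog at all.

```python
from pathlib import Path


def find_stale_volumes(archive_dir, keep_names, prefix="Vol"):
    """Return volume files under archive_dir whose names are not in keep_names.

    Only regular files whose names start with `prefix` are considered,
    so unrelated files in the directory are never reported.
    """
    keep = set(keep_names)
    stale = []
    for path in sorted(Path(archive_dir).iterdir()):
        if (path.is_file()
                and path.name.startswith(prefix)
                and path.name not in keep):
            stale.append(path)
    return stale


def remove_volumes(paths, dry_run=True):
    """Delete the given files; with dry_run=True only report them."""
    for path in paths:
        if dry_run:
            print(f"would remove {path}")
        else:
            path.unlink()
            print(f"removed {path}")
```

Usage would be something like
remove_volumes(find_stale_volumes("/backup/bacula", keep)), which only prints
what it would delete until dry_run=False is passed.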


===
Relevant SD conf

Storage {                          
  Name = backup2-sd
  SDPort = 9103                  
  WorkingDirectory = "/var/db/bacula"
  Pid Directory = "/var/run"
  Maximum Concurrent Jobs = 20
} 


Device {
  Name = FileStorage
  Media Type = File
  Archive Device = /backup/bacula
  LabelMedia = yes;                   
  Random Access = Yes;
  AutomaticMount = yes;               
  RemovableMedia = no;
  AlwaysOpen = yes;
}

=====

=====
Highlights from director conf

Pool {
    Name = Default
    Pool Type = Backup
    Recycle = yes                       
    AutoPrune = yes                     
    Volume Retention = 8 days
    LabelFormat = "Vol"
    Maximum Volume Bytes = 4000000000
    Accept Any Volume = yes

}

Storage {
      Name = backup1-sd
      Address = backup1
      SDPort = 9103
      Password = XXXXXXXX
      Device = FileStorage
      Maximum Concurrent Jobs = 3
      Media Type = File
}

=====






James Ashton
Vortech Inc



_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users