In the message dated: Thu, 13 Oct 2011 11:54:47 +1100,
The pithy ruminations from "James Harper" on
<RE: [Bacula-users] seeking advice re. splitting up large backups -- dynamic filesets to prevent duplicate jobs and reduce backup time> were:

=> >
=> > In an effort to work around the fact that bacula kills long-running
=> > jobs, I'm about to partition my backups into smaller sets. For example,
=> > instead of backing up:
=> >
[SNIP!]

=>
=> Does Bacula really kill long running jobs? Or are you seeing the effect
=> of something at layer 3 or below (eg TCP connections timing out in
=> firewalls)?

Yes, Bacula kills long-running jobs. See the recent thread entitled
"Full backup fails after a few days with 'Fatal error: Network error with
FD during Backup: ERR=Interrupted system call'", or see:

	http://www.mail-archive.com/bacula-users@lists.sourceforge.net/msg20159.html

=> I think your dynamic fileset idea would break Bacula's 'Accurate Backup'
=> code. If you are not using Accurate then it might work but it still
=> seems like a lot of trouble to go to to solve this problem.
=>

Yeah, it'll also break incremental and differential backups. I'm not going
ahead with this plan, at least not in the current form.

I really, really wish there was a way to prohibit multiple bacula jobs (of
different names & filesets) that access the same client.

I've logically split the multi-TB filesets into several bacula jobs.
However, this means that they will run "in parallel" and place a
significant load on the file & backup servers. I've created staggered
schedules for full backups (i.e., subset 1 is backed up on the 1st Wed of
the month, subset 2 on the 2nd Wed, etc.; rough sketch appended below),
but this won't help initially: they are 'new' jobs, so bacula will promote
the first incrementals to full backups.

=> If you limited the maximum jobs on the FD it would only run one at once,
=> but if the link was broken it might fail all the jobs.

That doesn't work, as we back up ~20 small machines in addition to the
large (4 to 8TB) filesystems.

=>
=> Another option would be a "Run After" to start the next job. Only the
=> first job would be scheduled, and it would run the next job in turn.
=> Then they would all just run in series. You could even take it a step
=> further and have the "Run After" script retry the same job if it
=> failed due to a connection problem, and give up after so many
=> retries. Maybe it could even start pinging the FD to see if it was
=> reachable (if backing up over an unreliable link is the problem you are
=> trying to solve).

An unreliable link is not the problem here. In fact, depending on the
state of our HA cluster, the 'bacula' server may also be the 'file'
server client.

Thanks,

Mark

=>
=> James
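
P.S. For anyone hitting this thread later: the staggered schedules I
mention above are roughly of the following shape. This is only a sketch;
the resource names, times, and storage/pool details are made up for
illustration, not our actual configuration.

  Schedule {
    Name = "MonthlyStagger-1stWed"
    Run = Level=Full 1st wed at 23:05
    Run = Level=Incremental 2nd-5th wed at 23:05
    Run = Level=Incremental sun-tue at 23:05
    Run = Level=Incremental thu-sat at 23:05
  }

  Schedule {
    Name = "MonthlyStagger-2ndWed"
    Run = Level=Full 2nd wed at 23:05
    Run = Level=Incremental 1st wed at 23:05
    Run = Level=Incremental 3rd-5th wed at 23:05
    Run = Level=Incremental sun-tue at 23:05
    Run = Level=Incremental thu-sat at 23:05
  }

  # One Job per logical subset of the big filesystem, each pointed at its
  # own stagger so the monthly fulls land on different Wednesdays.
  Job {
    Name = "bigfs-subset1"
    Type = Backup
    Client = bigfs-fd
    FileSet = "bigfs-subset1"
    Schedule = "MonthlyStagger-1stWed"
    Storage = File
    Pool = Default
    Messages = Standard
  }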
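
And for completeness, James's run-in-series suggestion would look roughly
like this: only the first subset job carries a Schedule, and each job's
RunScript kicks off the next subset when it finishes. The script path and
job names below are hypothetical; the script would do little more than
feed "run job=bigfs-subset2 yes" to bconsole, and could add its own
retry/give-up logic.

  Job {
    Name = "bigfs-subset1"
    # ... usual Job directives (Type, Client, FileSet, Storage, Pool,
    # Messages, and a Schedule on this first job only) ...
    RunScript {
      RunsWhen = After
      RunsOnClient = No
      RunsOnFailure = Yes   # run even on failure so the script can retry
      Command = "/usr/local/sbin/run-next-subset.sh bigfs-subset2"
    }
  }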