In the message dated: Mon, 26 Sep 2011 16:28:23 +0200,
The pithy ruminations from Jeremy Maes on
<Re: [Bacula-users] Full backup fails after a few days with "Fatal error: Network error with FD during Backup: ERR=Interrupted system call"> were:
=> On 26/09/2011 16:01, R. Leigh Hennig wrote:
=> > Morning,
=> >
=> > I have a client that whenever I try to do a full backup, after 6 days,
=> > the backup fails with this error:
=> >
=> > Fatal error: Network error with FD during Backup: ERR=Interrupted
=> > system call
=> >
=> >
=> > In bacula-dir.conf, for that job definition, I have this:
=> >
=> > Full Max Run Time = 1036800
=> >
=> > So it should be able to run for up to 12 days, but after the 6th day,
=> > it's stopping. During that time it writes about 4.7 TB (with another 1
=> > TB to go). Running CentOS 5.5 with Bacula 5.0.2. Any thoughts?
=> >
=> >
=> > Thanks,
=> >
=> Bacula has a hardcoded time limit on jobs of 6 days. Kern called it an
=> "insanity check" as any job that runs that long isn't really something
=> you'd want ...o

Wow. A virtually undocumented setting that causes a fatal error in
long-running jobs. This may explain some failures that I've seen too.

Thank you for responding, and for pulling the reference from the archive.

I've been using Bacula since 2006, but until recently we didn't have jobs
that took that long to run. In the 4 years since this "feature" was
mentioned, there's been an overall growth in data & backups. In our case at
least, 6-day+ jobs (while not ideal) are not a good indication of an error,
and should not be terminated.

=>
=> See
=> http://www.mail-archive.com/bacula-users@lists.sourceforge.net/msg20159.html
=> for a discussion on the mailing list from the past, and a pointer on
=> where to change the time limit in the code if you wish.

Thanks for the reference. Seeing this from Kern makes me hesitate even more:

	take a look at src/lib/watchdog.c -- someplace in that file
	there should be a tag that sets the timeout

The "someplace" and "should be" really lend confidence if I need to start
hacking the source code.

=>
=> Last time this was asked on the list someone pointed to a possible
=> configuration option to override the hardcoded limit that should've been
=> added by now, but given the 0 responses to that I can't say if it
=> actually exists.
=>
=> Regards,
=> Jeremy

Mark
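
For what it's worth, the numbers quoted in this thread can be sanity-checked with a few lines of Python. This is only a rough sketch: it assumes the job keeps writing at the same average rate it managed over the first 6 days, which is a simplification, and the function and variable names are made up for illustration (they are not Bacula settings or APIs). The only values taken from the thread are the configured run time of 1036800 seconds, the 6-day limit, and the 4.7 TB written with about 1 TB to go.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def seconds_to_days(seconds):
    """Convert a run-time value given in seconds to days."""
    return seconds / SECONDS_PER_DAY


def estimated_total_days(written_tb, remaining_tb, elapsed_days):
    """Estimate total job duration, ASSUMING a constant average write rate."""
    rate_tb_per_day = written_tb / elapsed_days
    return elapsed_days + remaining_tb / rate_tb_per_day


# Values quoted in the thread.
full_max_run_time = 1036800   # seconds, from the job definition
hardcoded_limit_days = 6      # the limit described in the reply

configured_days = seconds_to_days(full_max_run_time)
needed_days = estimated_total_days(written_tb=4.7, remaining_tb=1.0,
                                   elapsed_days=hardcoded_limit_days)

print(f"Configured limit: {configured_days:.1f} days")
print(f"Estimated time needed: {needed_days:.1f} days")
print(f"Fits under 6-day limit: {needed_days <= hardcoded_limit_days}")
print(f"Fits under configured limit: {needed_days <= configured_days}")
```

Under that assumption the job needs a bit over 7 days, so it would finish comfortably inside the configured 12 days but can never complete under a 6-day cutoff.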