Hi,

25.07.2007 01:51, Support wrote:
> Dear All
>
> The major error seems to be
>
> 24-Jul 14:37 elizabeth-dir: civeng54.2007-07-24_09.05.01 Error: open mail
> pipe /usr/sbin/bsmtp -h localhost -f "(Bacula) [EMAIL PROTECTED]" -s "Bacula:
> Backup Fatal Error of civeng54 Full" [EMAIL PROTECTED] failed: ERR=Cannot
> allocate memory
>
> Several errors like this occur with subsequent queued jobs - then the
> daemon seems to die.
>
> Is there a memory leak?

I'm not sure... I had to struggle with memory exhaustion both at my own
installation and at customer sites, and even though I could never
actually prove a memory leak, I suspect that the database libraries for
MySQL sometimes use lots of memory without properly releasing it again.

Reasons for my assumptions (note that this is just a collection of
observations, not real debugging):
- The problem happened only when using MySQL.
- It doesn't matter if the database runs on the same machine as the DIR
  or remotely.
- Once no jobs are active in the DIR, memory consumption goes down again.
- For this problem, even jobs that are waiting for resources count as
  active.
- The more jobs you run simultaneously, the more likely it is that
  memory gets exhausted.

My workaround so far is to run the jobs in non-overlapping bunches, i.e.
instead of starting all 40 jobs at once, I run 13, 14, and 13 with
different schedules so that the DIR is idle between these bunches.

> I have had bacula do 800+ jobs without an error until having this type of
> problem.

Perhaps, as jobs take longer because the data volume increases, the
underlying problem becomes more significant.

> I have had this occur twice in recent weeks. While a client is being
> backed up (a laptop) and the user has disconnected it without stopping the
> file daemon.
>
> This results is the job now relies on the default TCP timeout of 2 hours.
>
> Could the director monitor the client and if it is not responding in NN
> minutes / hours terminate the job. You cannot cancel a job if the client
> is not "there".
>
> What seems to happen is this kills the director (2.0.3) and I have had to
> restart bacula.
>
> See below - a prior full backup of the client took 2 hrs for 30 GB
>
> Any ideas / fixes.

See above for my workaround. Or upgrade to the latest beta; IIRC some
catalog database memory problems are fixed there.

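To make the bunching workaround above concrete, here is a rough planning sketch in Python. It is only an illustration, not anything Bacula provides: the job names, the 20:00 start time, and the three-hour gap are made-up assumptions, and the gap would have to be longer than a bunch actually takes so that the DIR really goes idle in between. Each resulting bunch would then get its own schedule.

```python
from datetime import datetime, timedelta


def split_into_batches(jobs, n_batches):
    """Split jobs into n_batches contiguous groups whose sizes differ by at most one."""
    if n_batches < 1:
        raise ValueError("n_batches must be at least 1")
    base, extra = divmod(len(jobs), n_batches)
    batches, start = [], 0
    for i in range(n_batches):
        # The first `extra` batches take one additional job each.
        size = base + (1 if i < extra else 0)
        batches.append(jobs[start:start + size])
        start += size
    return batches


def stagger(batches, first_start, gap):
    """Pair each batch with a start time, spaced `gap` apart."""
    return [(first_start + i * gap, batch) for i, batch in enumerate(batches)]


# Hypothetical inputs: 40 placeholder job names, three bunches.
jobs = [f"job{n:02d}" for n in range(1, 41)]
batches = split_into_batches(jobs, 3)
plan = stagger(batches, datetime(2007, 7, 25, 20, 0), timedelta(hours=3))

for start, batch in plan:
    print(start.strftime("%d-%b %H:%M"), len(batch), "jobs:", batch[0], "..", batch[-1])
```

With 40 jobs this gives bunches of 14, 13, and 13 rather than the 13, 14, 13 mentioned above; the exact split should not matter as long as the bunches do not overlap.
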
Arno

> Thanks
> Stephen Carr
>
> 24-Jul 14:37 elizabeth-dir: civeng54.2007-07-24_09.05.01 Fatal error:
> Network error with FD during Backup: ERR=Connection reset by peer
> 24-Jul 14:37 elizabeth-dir: civeng54.2007-07-24_09.05.01 Fatal error: No
> Job status returned from FD.
> 24-Jul 14:37 elizabeth-dir: civeng54.2007-07-24_09.05.01 Error: Bacula
> 2.0.3 (06Mar07): 24-Jul-2007 14:37:50
>   JobId:                  26426
>   Job:                    civeng54.2007-07-24_09.05.01
>   Backup Level:           Full
>   Client:                 "civeng54" Windows XP,MVS,NT 5.1.2600
>   FileSet:                "workstation" 2006-09-07 16:00:03
>   Pool:                   "Migrate-Full" (From Job FullPool override)
>   Storage:                "File" (From Pool resource)
>   Scheduled time:         24-Jul-2007 09:05:00
>   Start time:             24-Jul-2007 10:37:39
>   End time:               24-Jul-2007 14:37:50
>   Elapsed time:           4 hours 11 secs
>   Priority:               10
>   FD Files Written:       0
>   SD Files Written:       0
>   FD Bytes Written:       0 (0 B)
>   SD Bytes Written:       0 (0 B)
>   Rate:                   0.0 KB/s
>   Software Compression:   None
>   VSS:                    no
>   Encryption:             no
>   Volume name(s):         Full0005
>   Volume Session Id:      298
>   Volume Session Time:    1184798524
>   Last Volume Bytes:      16,998,911,923 (16.99 GB)
>   Non-fatal FD errors:    2
>   SD Errors:              0
>   FD termination status:  Error
>   SD termination status:  Error
>   Termination:            *** Backup Error ***
>
> 24-Jul 14:37 elizabeth-dir: civeng54.2007-07-24_09.05.01 Error: open mail
> pipe /usr/sbin/bsmtp -h localhost -f "(Bacula) [EMAIL PROTECTED]" -s "Bacula:
> Backup Fatal Error of civeng54 Full" [EMAIL PROTECTED] failed: ERR=Cannot
> allocate memory
> 24-Jul 14:37 elizabeth-dir: Error: open mail pipe /usr/sbin/bsmtp -h
> localhost -f "(Bacula) [EMAIL PROTECTED]" -s "Bacula daemon message"
> [EMAIL PROTECTED] failed: ERR=Cannot allocate memory

-- 
Arno Lehmann
IT-Service Lehmann
www.its-lehmann.de

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users