Sorry, reposting topic with appropriate version number in subject line.

On 12/14/2010 09:21 AM, Stephen Thompson wrote:
> On 07/28/2010 11:56 AM, Kern Sibbald wrote:
>> On Wednesday 28 July 2010 19:44:49 Stephen Thompson wrote:
>>> After running for 3 months without this problem, it happened again last
>>> night. We are running 5.0.2 at this point.
>>
>> I believe that the email problem is fixed in 5.0.3 which we will release
>> sometime this month, and which is in the git Source Forge repo, under
>> Branch-5.0.
>>
>> Kern
>
> We have been running 5.0.3 successfully since August (again about 3
> months) and just had this problem occur again last night. Same symptoms:
>
> 1) SD crashes
> 2) DIR sends out continuous stream of emails (apparent infinite loop)
>
> emails say:
> "client-fd JobId 100001: Fatal error: backup.c:1048 Network send
> error to SD. ERR=Broken pipe"
>
> So, I reckon the problem was not fixed in 5.0.3.
> I'll post traceback to bugs.bacula.org.
>
> thanks,
> Stephen
>
>>>
>>> Stephen
>>>
>>> On 04/15/2010 10:25 AM, Stephen Thompson wrote:
>>>> Hello,
>>>>
>>>> I have just now experienced a possible new bug with bacula 5.0.1.
>>>>
>>>> The symptoms are this:
>>>>
>>>> bacula-sd crashes
>>>> bacula-dir continues to run
>>>> bacula-dir then spews out identical "Intervention needed" emails until
>>>> manually restarted
>>>>
>>>> The first time this happened over a weekend and upon returning I found
>>>> my inbox has about 120,000 bacula emails, all the SAME and of this type:
>>>>
>>>> "15-Apr 10:02 client-fd JobId 100001: Fatal error: backup.c:1048 Network
>>>> send error to SD. ERR=Broken pipe"
>>>>
>>>> It happened again just now (second time since upgrading from 3.0.3 to
>>>> 5.0.1) and I managed to stop the director with only a few thousand
>>>> emails going out.
>>>>
>>>> So there are really 2 issues here:
>>>>
>>>> 1)
>>>> Why does the director apparently get stuck in an infinite loop of
>>>> sending the same email message?
>>>> Is this a known bug?
>>>>
>>>> 2)
>>>> Regarding the SD, I received one alert of this type, the rest like the
>>>> above:
>>>>
>>>> "15-Apr 10:02 server-sd: ERROR in lock.c:268 Failed ASSERT:
>>>> dev->blocked()"
>>>>
>>>> A traceback like:
>>>> --
>>>> ptrace: Operation not permitted.
>>>> /var/bacula/work/29091: No such file or directory.
>>>> $1 = 0
>>>> /opt/bacula-5.0.1/scripts/btraceback.gdb:2: Error in sourced command
>>>> file: No symbol "exename" in current context.
>>>> --
>>>>
>>>> And a bactrace like:
>>>> --
>>>> Attempt to dump current JCRs
>>>> JCR=0x19a24888 JobId=100000 name=client_1.2010-04-14_18.02.33_41
>>>> JobStatus=l use_count=1
>>>> JobType=B JobLevel=F
>>>> sched_time=14-Apr-2010 21:35 start_time=14-Apr-2010 21:35
>>>> end_time=31-Dec-1969 16:00 wait_time=31-Dec-1969 16:00
>>>> db=(nil) db_batch=(nil) batch_started=0
>>>> JCR=0x1981b248 JobId=100001 name=client_10.2010-04-14_20.00.15_04
>>>> JobStatus=R
>>>> use_count=1
>>>> JobType=B JobLevel=I
>>>> sched_time=15-Apr-2010 09:15 start_time=15-Apr-2010 09:15
>>>> end_time=31-Dec-1969 16:00 wait_time=31-Dec-1969 16:00
>>>> db=(nil) db_batch=(nil) batch_started=0
>>>> Attempt to dump plugins. Hook count=0
>>>> --
>>>>
>>>> Both clients and server seem healthy, except for the SD crash.
>>>> Any ideas?
>>>>
>>>> thanks!
>>>> Stephen
>>>>
>>>> -------------------------------------------------------------------------
>>>> Further info:
>>>>
>>>> My catalog...
>>>>
>>>> mysql-5.0.77 (64bit) MyISAM
>>>> 210Gb in size
>>>> 1,412,297,215 records in File table
>>>> note: database built with bacula 2x scripts,
>>>> upgraded with 3x scripts, then again with 5x scripts
>>>> (i.e. nothing customized along the way)
>>>>
>>>> My OS & hardware for bacula DIR+SD server...
>>>>
>>>> Centos 5.4 (fully patched)
>>>> 8Gb RAM
>>>> 2Gb Swap
>>>> 1Tb EXT3 filesystem on external fiber RAID5 array
>>>> (dedicated to database, incl.
>>>> temp files)
>>>> 2 dual-core [AMD Opteron(tm) Processor 2220] CPUs
>>>> StorageTek SL500 Library with 2 LTO3 Drives
>>>>
>>>> _______________________________________________
>>>> Bacula-devel mailing list
>>>> Bacula-devel@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/bacula-devel
>>
>
--
Stephen Thompson               Berkeley Seismological Laboratory
step...@seismo.berkeley.edu    215 McCone Hall # 4760
404.538.7077 (phone)           University of California, Berkeley
510.643.5811 (fax)             Berkeley, CA 94720-4760