Arno Lehmann wrote:
> Hi.
>
> David Marcin wrote:
>
>> Bacula exits unexpectedly, the only thing I can think of is that the
>> database has somehow become corrupted in such a way to kill the
>> director.
>>
>> As far as I know the system has been running for about 2 months
>> unchanged, however I am not the only person to have administrator rights
>> on the machine so I cannot be certain. I have upgraded to the latest
>> version of bacula available via debian's apt system. Details follow.
>>
>> # bacula-dir -?
>> Copyright (C) 2000-2004 Kern Sibbald and John Walker
>>
>> Version: 1.36.2 (28 February 2005)
>>
>> And the log of the error:
>>
>> # sed 's/quateams/backups/g' file
>> # bacula-dir -f -d99
>> bacula-dir: dird.c:131 Debug level = 99
>> backups-dir: cram-md5.c:52 send: auth cram-md5
>> <[EMAIL PROTECTED]> ssl=0
>> backups-dir: cram-md5.c:70 Authenticate OK K4+L5x5dVisA4+Erjy4IeB
>> backups-dir: cram-md5.c:120 sending resp to challenge:
>> fUcX5UYxq5/44iNvf8/pMA
>> backups-dir: ua_run.c:481 JobType=B
>> backups-dir: job.c:108 Open database
>> backups-dir: job.c:121 DB opened
>> backups-dir: btimers.c:169 Start bsock timer 0x80d0488 tid=0x10005 for
>> 600 secs at 1115840968
>> backups-dir: cram-md5.c:120 sending resp to challenge:
>> YxpC6AB+3j/VhB04dV+H+A
>> backups-dir: cram-md5.c:52 send: auth cram-md5
>> <[EMAIL PROTECTED]> ssl=0
>> backups-dir: cram-md5.c:70 Authenticate OK L74xUAR2vHEaNiUN3CwJuC
>> backups-dir: btimers.c:183 Stop bsock timer 0x80d0488 tid=0x10005 at
>> 1115840969.
>> backups-dir: fd_cmds.c:87 Opened connection with File daemon
>> backups-dir: btimers.c:169 Start bsock timer 0x80d2508 tid=0x10005 for
>> 600 secs at 1115840969
>> backups-dir: cram-md5.c:120 sending resp to challenge:
>> 3kdFKAdWj6Ni2HRo10+9KA
>> backups-dir: cram-md5.c:52 send: auth cram-md5
>> <[EMAIL PROTECTED]> ssl=0
>> backups-dir: cram-md5.c:70 Authenticate OK CB/ligxuxF/1f/+EA4pYNC
>> backups-dir: btimers.c:183 Stop bsock timer 0x80d2508 tid=0x10005 at
>> 1115840969.
>> backups-dir: ua_status.c:104 status:status:
>> backups-dir: ua_status.c:137 do_prompt: select daemon
>> backups-dir: ua_status.c:141 item=0
>> backups-dir: ua_status.c:104 status:status:
>> backups-dir: ua_status.c:137 do_prompt: select daemon
>> backups-dir: ua_status.c:141 item=2
>> backups-dir: fd_cmds.c:87 Opened connection with File daemon
>> backups-dir: btimers.c:169 Start bsock timer 0x80d2518 tid=0xc004 for
>> 600 secs at 1115841057
>> backups-dir: cram-md5.c:120 sending resp to challenge:
>> Gy/J0W+QmSYGOy17X9dlXB
>> backups-dir: cram-md5.c:52 send: auth cram-md5
>> <[EMAIL PROTECTED]> ssl=0
>> backups-dir: cram-md5.c:70 Authenticate OK t9/PBGFe26Uark4WPxFeIB
>> backups-dir: btimers.c:183 Stop bsock timer 0x80d2518 tid=0xc004 at
>> 1115841058.
>> backups-dir: ua_status.c:336 Connected to file daemon
>> backups-dir: ua_status.c:104 status:status:
>> backups-dir: ua_status.c:137 do_prompt: select daemon
>> backups-dir: ua_status.c:141 item=2
>> backups-dir: fd_cmds.c:87 Opened connection with File daemon
>> backups-dir: btimers.c:169 Start bsock timer 0x80d1a10 tid=0xc004 for
>> 600 secs at 1115841069
>> backups-dir: cram-md5.c:120 sending resp to challenge:
>> oW+kAChWM5I3YUxuP//GYD
>> backups-dir: cram-md5.c:52 send: auth cram-md5
>> <[EMAIL PROTECTED]> ssl=0
>> backups-dir: cram-md5.c:70 Authenticate OK 0G+xN/RAFTpd7EhbDQQ/OA
>> backups-dir: btimers.c:183 Stop bsock timer 0x80d1a10 tid=0xc004 at
>> 1115841069.
>> backups-dir: ua_status.c:336 Connected to file daemon
>> backups-dir: ua_prune.c:249 select sql=SELECT JobId from Job WHERE
>> JobTDate<1113249193 AND ClientId=2 AND PurgedFiles=0
>> backups-dir: ua_prune.c:279 Delete JobId=413
>> bacula-dir: src/pager.c:570: pager_playback_one_page: Assertion
>> `pPg->nRef==0 || pPg->pgno==1' failed.
>> Aborted
>
>
> What you report is the directors log during user interaction, right?

Yes, it is the output of running bacula-dir from the command line, in the
foreground, with debug level 99 (the manual said higher is better; I figured
that was pretty high ;) ).

>
> What I *think* I see is that you start a job manually, and after
> selecting the client the director crashes, probably where it chooses a
> job to base a differential or incremental backup upon.

Sorry, I should have provided more details about what was going on. I
started a job, which went along its merry way: it detected that it should
run a full backup and began backing up the files (a query to the fd shows
that it is indeed processing files). Then, at some indeterminate point
before the job finishes and is marked in the database, the director
crashes. After restarting the director, this is in the "messages" log:

11-May 23:04 backups-dir: No prior Full backup Job record found.
11-May 23:04 backups-dir: No prior or suitable Full backup found. Doing FULL backup.
11-May 23:04 backups-dir: Start Backup JobId 553, Job=3jane_Backup.2005-05-11_23.04.52
11-May 23:04 backups-sd: Volume "Full-0007" previously written, moving to end of data.

>
> Now, I didn't read the source, esp. src/pager.c around lines 570, but
> for me it would be helpful to have some other information:
> - First, screenshot of your interaction,

For simplicity, here is the input/output from bconsole:

*run
Using default Catalog name=MyCatalog DB=bacula
A job name must be specified.
The defined Job resources are:
     1: 3jane Backup
     <snip - other options>
Select Job resource (1-16): 1
Run Backup job
JobName:  3jane Backup
FileSet:  3jane
Level:    Incremental
Client:   3jane-fd
Storage:  File
Pool:     Default
When:     2005-05-11 23:09:00
Priority: 10
OK to run? (yes/mod/no): yes
Job started. JobId=554
*

> - Second, the relevant configuration (client, fileset, pools, storage)

bacula-dir.conf

JobDefs {
  Name = "3jane"
  Type = Backup
  Level = Incremental
  Client = 3jane-fd
  FileSet = "3jane"
  Schedule = "WeeklyCycle"
  Storage = File
  Messages = Standard
  Pool = Default
  Full Backup Pool = Full
  Incremental Backup Pool = Incr
  Differential Backup Pool = Diff
  Priority = 10
}

Job {
  Name = "3jane Backup"
  JobDefs = "3jane"
  Write Bootstrap = "/var/lib/bacula/3jane.bsr"
}

FileSet {
  Name = "3jane"
  Include {
    Options {
      signature = MD5
    }
    File = /root
    File = /etc
    File = /home
    File = /var
  }
}

Client {
  Name = 3jane-fd
  Address = 192.168.1.101
  FDPort = 9102
  Catalog = MyCatalog
  Password = "*******"          # password for FileDaemon
  File Retention = 30 days      # 30 days
  Job Retention = 6 months      # six months
  AutoPrune = yes               # Prune expired Jobs/Files
}

# Default pool definition
Pool {
  Name = Default
  Pool Type = Backup
  Purge Oldest Volume = yes
  Recycle = yes                 # Bacula can automatically recycle Volumes
  Recycle Oldest Volume = yes
  Label Format = "Volume-"
  AutoPrune = yes               # Prune expired volumes
  Volume Retention = 5 days
  Accept Any Volume = yes       # write on any volume in the pool
  Maximum Volumes = 5
  Maximum Volume Jobs = 1
}

Pool {
  Name = Full
  Pool Type = Backup
  Purge Oldest Volume = yes
  Recycle = yes                 # Bacula can automatically recycle Volumes
  Maximum Volume Jobs = 1
  Recycle Oldest Volume = yes
  Label Format = "Full-"
  AutoPrune = yes               # Prune expired volumes
  Volume Retention = 90 days
  Accept Any Volume = yes       # write on any volume in the pool
  Maximum Volumes = 10
}

Pool {
  Name = Diff
  Pool Type = Backup
  Purge Oldest Volume = yes
  Recycle = yes                 # Bacula can automatically recycle Volumes
  Recycle Oldest Volume = yes
  Label Format = "Diff-"
  AutoPrune = yes               # Prune expired volumes
  Volume Retention = 21 days
  Accept Any Volume = yes       # write on any volume in the pool
  Maximum Volumes = 40
  Maximum Volume Jobs = 1
}

Pool {
  Name = Incr
  Pool Type = Backup
  Purge Oldest Volume = yes
  Recycle = yes                 # Bacula can automatically recycle Volumes
  Recycle Oldest Volume = yes
  Label Format = "Incr-"
  AutoPrune = yes               # Prune expired volumes
  Volume Retention = 7 days
  Accept Any Volume = yes       # write on any volume in the pool
  Maximum Volumes = 10
  Maximum Volume Jobs = 10
}

bacula-sd.conf

Device {
  Name = FileStorage
  Media Type = File
  Archive Device = /mnt/backups/bacula/
  LabelMedia = yes;             # lets Bacula label unlabeled media
  Random Access = Yes;
  AutomaticMount = yes;         # when device opened, read it
  RemovableMedia = no;
  AlwaysOpen = no;
}

> - What OS and version of bacula runs on the client?

The client being backed up is the same host that runs the director and
storage daemon; we have only a few computers :) Bacula does not back up
the storage volumes.

> - Can you run other jobs on the client?
> - Can you run identical jobs on the client?

I don't understand what you mean. Do you mean with different directors? I
can run an estimate job on that client successfully. I can also run other
jobs successfully from the director.

> - What has the catalog about Job 413?

There is no Job 413; perhaps it was purged? In any case, I specifically
tried running a new full backup and it still fails.

I hope that is enough information. If we can't figure anything out, I
suppose I can try purging everything and running backups from scratch again.

Thanks for your help,

David

>
> If something with the database is wrong you can try to repair it.
> If something with Job 413 as a reference job is wrong, you can run a
> new full backup.
>
> Arno
>
>> It appears to be one particular backup that fails regularly. When run
>> manually, others seem to complete, while this one fails.
>>
>> I'd rather not dump the backups that have been made, but if it is
>> necessary it can be done.
>>
>> David
>>
>>
>> -------------------------------------------------------
>> This SF.Net email is sponsored by Oracle Space Sweepstakes
>> Want to be the first software developer in space?
>> Enter now for the Oracle Space Sweepstakes!
>> http://ads.osdn.com/?ad_id=7393&alloc_id=16281&op=click
>> _______________________________________________
>> Bacula-users mailing list
>> Bacula-users@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/bacula-users
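
For anyone following the thread: the failed assertion is in src/pager.c, which looks like SQLite's pager code rather than Bacula's own, so Arno's suggestion to check the database is a reasonable first step. Below is a rough sketch of that check and of the Job 413 lookup, not a tested procedure for a Bacula catalog. It assumes the catalog is a SQLite file that Python's sqlite3 module can open; a catalog from this era may use an older SQLite file format, in which case the matching sqlite command-line shell would be needed instead. The table and column names are taken from the SQL in the debug log above; the function names and the in-memory stand-in table are made up for illustration.

```python
import sqlite3


def integrity_report(conn):
    """Run SQLite's built-in consistency check.

    A healthy database yields a single row containing "ok";
    otherwise each row describes one problem that was found.
    """
    return [row[0] for row in conn.execute("PRAGMA integrity_check")]


def job_row(conn, job_id):
    """Fetch the catalog record for one job, or None if it does not exist."""
    cur = conn.execute(
        "SELECT JobId, ClientId, PurgedFiles FROM Job WHERE JobId = ?",
        (job_id,),
    )
    return cur.fetchone()


if __name__ == "__main__":
    # Stand-in for the real catalog: an in-memory database with a
    # hypothetical, heavily reduced Job table (assumption, not Bacula's schema).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Job (JobId INTEGER PRIMARY KEY, "
        "ClientId INTEGER, PurgedFiles INTEGER)"
    )
    conn.execute("INSERT INTO Job VALUES (412, 2, 0)")

    print(integrity_report(conn))  # consistency check result
    print(job_row(conn, 413))      # the JobId the prune step tried to delete
    conn.close()
```

To try this against a real catalog, stop the director first, work on a copy of the database file, and pass that copy's path to sqlite3.connect() instead of ":memory:".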