Hi,

On 1/17/2007 5:28 PM, David Romerstein wrote:
> Hi, folks,
>
> I'm relatively new to bacula, but experienced with a number of other backup
> programs (so, none of the concepts involved in running bacula are
> particularly foreign to me). My primary purpose in installing and running
> bacula is to archive some 2TB of data we've got stored. I'm running a single
> Dell Ultrium LTO-3 (400/800GB) drive, no autoloader, under RHEL 4.
>
> I've been googling for answers since last night, in addition to going over
> the docs with a fine-tooth comb, and I'm not making any headway.
>
> Here's my problem. The first tape swap (at about 420GB) worked just fine. Per
> the docs, I unmounted the current tape, inserted a new one, labelled it, and
> *whee*, off we go for another two and a half days of writing. Over this past
> weekend, tape #2 filled. Sunday afternoon, I got an 'Intervention required'
> email:
>
> 14-Jan 16:32 srv01-sd: Job BackupFileStore.2007-01-08_17.07.02 waiting.
> Cannot find any appendable volumes.
> Please use the "label" command to create a new Volume for:
>     Storage:      "DellUltrium" (/dev/nst0)
>     Media type:   LTO-3
>     Pool:         Default
>
> ... followed shortly thereafter by an error email:
>
> 14-Jan 17:09 srv01-dir: BackupFileStore.2007-01-08_17.07.02 Error:
> message.c:483 Mail program terminated in error.
> CMD=/sbin/bsmtp -h localhost -f "(Bacula) [EMAIL PROTECTED]" -s
> "Bacula: Backup Fatal Error of srv01-fd Full" [EMAIL PROTECTED]
> ERR=Child died from signal 15: Termination

That does look bad. Bacula tried to inform you that the job had ended with an
error, and it couldn't send that mail. I know these symptoms from machines
running out of memory.

> ... and no further intervention mails.
>
> Yesterday was my first opportunity to get to the colo facility in which these
> devices sit, and I tried another tape swap - first, unmounted the tape in the
> drive, ejected it, inserted a new tape, labelled it 'xRaid1_3' (following the
> naming convention I'd established for the first two tapes, 'xRaid1' and
> 'xRaid1_2'), and mounted it.
>
> There's been no activity since then, no data has been written to the new tape.

That's not really surprising, as that job has ended.

> I'm being told the job's not running:
>
> *status jobid=9
> JobId 9 is not running.
>
> Storage status shows me this:
>
> *status storage=Ultrium
> Automatically selected Catalog: MyCatalog
> Using Catalog "MyCatalog"
> Connecting to Storage daemon Ultrium at srv01:9103
>
> srv01-sd Version: 2.0.0 (04 January 2007) i686-pc-linux-gnu redhat Enterprise
> release
> Daemon started 08-Jan-07 17:04, 9 Jobs run since started.
> Heap: bytes=217,936 max_bytes=355,104 bufs=105 max_bufs=134
>
> Running Jobs:
> Writing: Full Backup job BackupFileStore JobId=9 Volume=""
>     pool="Default" device=""DellUltrium" (/dev/nst0)"
>     Files=59,328,036 Bytes=848,654,987,013 Bytes/sec=1,221,089
>     FDReadSeqNo=543,013,723 in_msg=365059758 out_msg=5 fd=6
> ====

OK, so we have an SD in an inconsistent state. You'll have to restart it, I
assume.
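
On the out-of-memory suspicion mentioned earlier: one way to check it during the next run is to log memory figures at regular intervals. What follows is only a minimal sketch and not part of Bacula. It assumes a Linux host that exposes /proc/meminfo and a Python 3.9+ interpreter (which is not what RHEL 4 ships), and the names parse_meminfo, free_fraction and watch are made up for illustration.

```python
import time
from pathlib import Path


def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field name: numeric value}.

    Most fields are reported in kB; the unit suffix is ignored here.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def free_fraction(info: dict[str, int]) -> float:
    """Fraction of RAM that is free or easily reclaimable.

    Assumption: free + buffers + page cache is a good enough estimate;
    older kernels do not provide a MemAvailable field.
    """
    reclaimable = info["MemFree"] + info.get("Buffers", 0) + info.get("Cached", 0)
    return reclaimable / info["MemTotal"]


def watch(interval_s: float = 60.0, samples: int = 5,
          path: str = "/proc/meminfo") -> None:
    """Print one timestamped memory line per sample (hypothetical helper)."""
    for _ in range(samples):
        info = parse_meminfo(Path(path).read_text())
        stamp = time.strftime("%d-%b %H:%M:%S")
        print(f"{stamp} free={free_fraction(info):.1%} "
              f"swap_free_kB={info.get('SwapFree', 0)}")
        time.sleep(interval_s)
```

Calling watch() with a larger samples value while a backup is running, and redirecting the output to a file, would show whether free memory and swap shrink steadily over the course of the job.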

> Jobs waiting to reserve a drive:
> ====
>
> Terminated Jobs:
> JobId  Level    Files      Bytes  Status  Finished         Name
> ===================================================================
>     8  Full    10,001    121.6 M  OK      08-Jan-07 17:03  BackupFileStore
>    18  Incr         0          0  OK      14-Jan-07 17:09  Client1
>    20  Incr         0          0  OK      14-Jan-07 17:10  Client1
>    19  Full         1    8.861 G  OK      14-Jan-07 17:53  BackupCatalog
>    21  Full         1    8.861 G  OK      14-Jan-07 18:39  BackupCatalog
>    22  Diff         0          0  OK      14-Jan-07 23:05  Client1
>    23  Full         1    8.861 G  OK      14-Jan-07 23:55  BackupCatalog
>    24  Incr         0          0  OK      15-Jan-07 23:05  Client1
>    25  Full         1    8.861 G  OK      16-Jan-07 00:02  BackupCatalog
>    26  Full         1    8.861 G  OK      16-Jan-07 23:58  BackupCatalog
> ====
>
> Device status:
> Device "FileStorage" (/tmp) is not open.
> Device "DellUltrium" (/dev/nst0) is mounted with Volume="xraid1_3"
> Pool="Default"
>     Device is BLOCKED waiting for media.
>     Total Bytes Read=0 Blocks Read=0 Bytes/block=0
>     Positioned at File=1 Block=0
> ====

That might be a state which normally exists only for a short time. It looks
like the media is labeled and mounted, and the SD has to decide whether this
volume is valid for the running job.

> In Use Volume status:
> xraid1_3 on device "DellUltrium" (/dev/nst0)
> ====
>
> HELP! I'd really rather not restart this job if I don't have to ('6 days of
> archiving wasted' != 'conducive to low stress levels'). I've unmounted and
> remounted this tape, with no change.
>
> Any suggestions? Thanks!

You'll have to restart that job, I fear.

If you want to know more about the underlying problem, reading through the
console messages file should give you the information that couldn't be
mailed to you.

Of course, it is now advisable to make sure you don't have a serious
problem. I'd run the SD in debug mode with a level of at least 100 and
capture the log output. Also, observing memory load during backup
operations might reveal a problem.

Arno

> -- D
>
> -------------------------------------------------------------------------
> Take Surveys. Earn Cash.
> Influence the Future of IT
> Join SourceForge.net's Techsay panel and you'll get the chance to share your
> opinions on IT & business topics through brief surveys - and earn cash
> http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
> _______________________________________________
> Bacula-users mailing list
> Bacula-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bacula-users

-- 
IT-Service Lehmann                    [EMAIL PROTECTED]
Arno Lehmann                  http://www.its-lehmann.de