Hello again,

An update, with questions: although the disk was in pretty bad shape, I was 
able to copy it with dd_rescue. It took 10 minutes to copy the 8 GB partition 
to a file.

The catalog was functional, but I was still not able to dump it.

I reloaded the last good catalog backup, and it is fine so far. However, the 
very fine manual could mention under "Restoring When Things Go Wrong" that 
once you have dropped the Bacula tables, you are not supposed to "make" them 
again if you intend to reload a dumped catalog.

It could also reasonably add this: "psql dbname < infile" under postgres 
reloading.
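
To make the reload step concrete, here is a minimal sketch of what I mean. 
The database name "bacula" and the dump path are my assumptions, not something 
from the manual, and the function name reload_catalog is made up for 
illustration. It prints the psql command by default and only runs it when 
DRY_RUN=0 is set. The reasoning, as far as I understand it, is that a plain 
pg_dump file normally carries its own CREATE TABLE statements, so the tables 
should not already exist when it is loaded.

```shell
#!/bin/sh
# Sketch: reload a dumped Bacula catalog into PostgreSQL.
# Assumptions (not from the manual): database name "bacula" and the
# dump file path below. reload_catalog is a hypothetical helper name.
# DRY_RUN defaults to 1, so the command is only printed, not executed.

reload_catalog() {
    db="$1"
    dump="$2"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Show what would be run.
        echo "psql $db < $dump"
    else
        # Load the dump into the (empty) database. Do not create the
        # Bacula tables first; the dump is expected to create them.
        psql "$db" < "$dump"
    fi
}

reload_catalog bacula /var/bacula/bacula.sql
```

Run it once as is to check the command line, then with DRY_RUN=0 to do the 
actual reload.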

I then tried to bscan the latest backups into the catalog, first one volume 
at a time to see whether that works, then many of them on the same command 
line until bscan gave up, eventually finding out that it does not work one 
volume at a time.

Googling then finds this:
>3. If you *really* need to use bscan, be sure to feed it *all* the appropriate 
>volumes in a *single* bscan execution (not one for each tape) with the 
>volumes specified in the right order.  This was clearly and correctly 
>documented, IMO, but I've added more to this effect ...
>
>-- 
>Best regards,
>
>Kern

I must say that I read the small chapter on bscan several times, mostly to 
see whether there was anything specific on disk-based volumes, which there 
isn't. The only thing I can find so far that comes close to the above 
information is:

>If you have multiple tapes, you can scan them with:
> bscan -s -m -c bacula-sd.conf -v -V Vol001\|Vol002\|Vol003 /dev/nst0
>
> You should, where ever possible try to specify the tapes in the order they 
>are written. However, bscan can handle scanning tapes that are not 
>sequential. Any incomplete records at the end of the tape will simply be 
>ignored in that case

So, unless I'm blind (which I am sometimes), including the above mail 
extract from Kern in the manual, plus the information that bscan can only 
handle a few volumes on the command line, would have saved me an hour 
today.
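
For what it is worth, this is roughly how I would now build the single bscan 
call. It is only a sketch: join_volumes is a helper name I made up, the volume 
names and /dev/nst0 are taken from the manual's example above, and the other 
options are copied from it unchanged. The script just prints the command so 
it can be checked before it is run.

```shell
#!/bin/sh
# Sketch: build one bscan invocation covering all volumes, in the
# order they were written, instead of one bscan run per volume.
# join_volumes is a hypothetical helper; volume names and the device
# are placeholders copied from the manual's example.

join_volumes() {
    # Join all arguments with "|" in the order given.
    list=""
    for vol in "$@"; do
        if [ -z "$list" ]; then
            list="$vol"
        else
            list="$list|$vol"
        fi
    done
    printf '%s\n' "$list"
}

volumes=$(join_volumes Vol001 Vol002 Vol003)

# Print the command rather than executing it; the single quotes keep
# the shell from treating "|" as a pipe when the line is pasted.
echo "bscan -s -m -c bacula-sd.conf -v -V '$volumes' /dev/nst0"
```

Listing the volumes in written order is the part that matters; the quoting 
is just an alternative to the backslashes used in the manual.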

The chapter "Restoring the Server" says that the Disaster Recovery CD is only 
for Clients, and then the first point, bringing up the static versions, is 
left without any further instructions.

Is this still an exercise left to explore?
In cases where the disk is only partially lost, or you have only lost one disk 
out of several, or you are disk-based, recovering the original server may 
still be the best way, provided there is a clear set of instructions.

Last, under Disaster Recovery CD, the text assumes that you have compiled 
Bacula yourself. I have installed from RPMs. The README file still states 
that I should call make with <bacula source>.
Am I to install the source RPM in order to create the rescue CD?

Thanks for any feedback on this.

Regards

Steen

On Monday 04 September 2006 14:09, steen meyer wrote:
> Hi all,
>
> The Bacula catalog in Postgres has become corrupted, it seems due to disk
> errors. I can still make a backup, but cannot dump the catalog anymore.
>
> It does write most of the catalog, but doesn't complete.
>
> This occurs after a lot of full backups this weekend, so I am not fond of
> deleting the catalog and reloading the one from Friday.
>
> The catalog resides on an ordinary IDE disk.
>
> Does anyone have a suggestion as to how I can best rescue the system - how
> can I get the full catalog copied over to a new disk and continue to run?
>
> I have been thinking of simply restarting the server, then it would repair
> the disk structure if possible, but how can I best prepare for the scenario
> that it won't and can't start or that the repair will destroy the catalog?
>
>
> Errors here:
>
> 04-Sep 13:24 adm-backup-dir: RunBefore: pg_dump: ERROR:  could not read
> block 202758 of relation 1663/17230/2194377: Success
> 04-Sep 13:24 adm-backup-dir: RunBefore: pg_dump: SQL command to dump the
> contents of table "file" failed: PQendcopy() failed.
> 04-Sep 13:24 adm-backup-dir: RunBefore: pg_dump: Error message from server:
> ERROR:  could not read block 202758 of relation 1663/17230/2194377: Success
> 04-Sep 13:24 adm-backup-dir: RunBefore: pg_dump: The command was: COPY
> public.file (fileid, fileindex, jobid, pathid, filenameid, markid, lstat,
> md5) TO stdout;
> 04-Sep 13:24 adm-backup-dir: BackupCatalog.2006-09-04_13.22.45 Fatal error:
> RunBeforeJob error: ERR=Child exited with code 1
>
> There are hardware errors in syslog:
> Sep  4 13:24:12 adm-backup kernel: hda: dma_intr: status=0x51 { DriveReady
> SeekComplete Error }
> Sep  4 13:24:12 adm-backup kernel: hda: dma_intr: error=0x40
> { UncorrectableError }, LBAsect=344781, sector=344775
> Sep  4 13:24:12 adm-backup kernel: ide: failed opcode was: unknown
> Sep  4 13:24:12 adm-backup kernel: end_request: I/O error, dev hda, sector
> 344775

-- 
Regards

Steen

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users