Arno Lehmann wrote:
Hi.
I'll only suggest what to try to check if your tapes are ok for bacula.
Tom Morgan wrote:
...lots of stuff snipped
15-Apr 09:17 drakul-sd: Please mount Volume "mp0008" on Storage
Device "Sun L280" for Job hella.2005-04-15_01.05.06
15-Apr 09:19 drakul-sd: hella.2005-04-15_01.05.06 Fatal error: Fatal
device error: ERR=label.c:202 Expecting Volume Label, got FI=0
Stream=0 len=64412
...
This is what I am getting when I try to mount the volume:
*mount mp0006
Nah... you don't give the volume name, that's read from the tape, you
give the device name...
Storage resource "mp0006": not found
Automatically selected Storage: DLT-7000
... but in this case it didn't matter.
3905 Device /dev/nst0 open but no Bacula volume is mounted.
If this is not a blank tape, try unmounting and remounting the Volume.
Sorry, I'll not go through the whole of your configuration...
Well, I had similar cases: For some reason, bacula wouldn't recognize
the volumes in tape drives.
I found out the following:
- Tapes can - obviously - be damaged, though this should not happen
with all your tapes.
- Sometimes, especially with certain tape drives, the drives choke
when bacula tries to mount a device while it's still loading or
calibrating a tape. More info below.
- Block sizes. I had tapes written with other block sizes than what I
finally used, and once you insert one of these tapes you need to
relabel them.
- The SD can choke. I think I saw this once or twice, but since I
don't usually run the daemons with logging that's hard to verify, and
I've never tried to reproduce such errors.
- Sometimes the SD seems to be stuck when a tape drive is treated in a
way it doesn't like, like doing a forced eject and loading of another
tape while the device is opened by bacula. This might be scsi driver
dependent, but see below...
A procedure I use when a tape is not recognized, which most often
happens after I did something I should not have done:
1. Unmount using the console. When the console hangs and you are
impatient, quit it using C-c and kill it from the shell. Restart,
verify with "status storage". Wait. Wait longer, verify status again,
sometimes the SD recovers.
2. mount from the console. When you get a status like "device in
acquire" you need time and good nerves. Read further at 10.
3. If the tape label is not recognized as being a bacula label,
unmount or release the device.
4. Use, from the shell, mt to verify the tape drive's state and, very
important, block size setting.
4.a) If it's wrong, unmount the tape and reload it. Check again.
4.b) If it's still wrong, use mt with the tape loaded to set the right
values.
5. Use btape to try to read the tape label.
6. If btape doesn't give the proper label and you are sure that tape
can be overwritten, like when the catalog says the tape is purged or
recycled, use the label command from btape to assign the same label to
the tape again.
7. Use readlabel from btape to verify the label. Note, preferably on
its label, that the tape is probably damaged. If you again encounter
problems with it, remove it from your pool.
8. Unload the tape, reload it, use mt and btape to verify it can be
read now.
9. From bacula's console, mount the tape. It should work now.
10. This situation took some effort to recover :-)
First, wait and see if the situation recovers by itself. Repeat until
you run out of time or nerves.
11.a) If you don't mind, simply restart the computer with the tape
attached. Goto step 2.
11.b) If that is not an option, but you can restart the SD alone, do
that: use kill or bacula stop to stop bacula-sd. Verify all instances
are shut down, then restart it. Start again at step 2.
12. If your tape drive is an external one, you can try turning it off,
waiting a while, and turning it on again. Do _not_ remove SCSI cables,
unless you know the stuff is meant for hot-plugging.
13. Verify the tape device is available to the kernel. Under linux,
for example, do 'cat /proc/scsi/scsi' or dmesg. You can use commands
like 'echo scsi add-single-device 0 0 4 0 > /proc/scsi/scsi' or
similar to inform the kernel of its presence if it is not listed.
(Seriously, I had one such case: A hard disk on the scsi bus with some
tape drives died, the scsi HBA reset and reset and reset, finally gave
up, when one tape drive decided it wouldn't like that. Then, I managed
to take the disk out of its volume group, reset the tape drive by
power-cycling, added it to the driver as described, and my backup
continued. Later verification showed the backup was ok. And yes, I do
like old and unreliable hardware ;-)
14. Continue at step 2.
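
Steps 4 to 8 could be sketched from the shell roughly as below. This is
only an outline under assumptions: the drive is /dev/nst0 as in the log
above, the SD configuration path is a guess for illustration, block size
0 (variable) is just an example value, and mt is assumed to be the linux
mt-st variant, which provides setblk. The run wrapper and the function
names are made up for this sketch.

```shell
#!/bin/sh
# Assumed defaults; adjust to your own setup.
TAPE=${TAPE:-/dev/nst0}
SD_CONF=${SD_CONF:-/etc/bacula/bacula-sd.conf}   # hypothetical path

# With DRY_RUN=1 a command is only printed, not executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Step 4: show drive state, including the current block size.
check_drive() {
    run mt -f "$TAPE" status
}

# Step 4.b: set the block size; 0 means variable block size.
set_blocksize() {
    run mt -f "$TAPE" setblk "${1:-0}"
}

# Steps 5 to 7: start btape, then type readlabel (or label) at its prompt.
open_btape() {
    run btape -c "$SD_CONF" "$TAPE"
}

# Step 8: rewind and unload the tape; reload it by hand afterwards.
unload_tape() {
    run mt -f "$TAPE" offline
}
```

With DRY_RUN=1 set, calling check_drive only prints the mt command, so
the sequence can be reviewed before it touches a drive that bacula may
still hold open.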
Well, that's an outline that should be modified according to your
system and knowledge. Looking in the system's log files might help, and
having scsi drivers which can log their problems in a way you
understand is also very helpful. I at least had some very interesting
time reading the source code of linux' aic7xxx driver to see what the
thousands of error log lines were telling me.
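
For the check in step 13, a small helper along these lines might do. It
assumes the usual /proc/scsi/scsi layout, where tape drives show up with
the type "Sequential-Access"; the function name is invented here, and on
kernels without that proc file the check simply reports nothing found.

```shell
#!/bin/sh
# Succeeds if the given file (default /proc/scsi/scsi) lists a tape drive.
has_tape_drive() {
    grep -q 'Sequential-Access' "${1:-/proc/scsi/scsi}" 2>/dev/null
}

if has_tape_drive; then
    echo "tape drive is known to the kernel"
else
    echo "no tape drive listed, check dmesg"
fi
```

If the drive is missing, the add-single-device line from step 13 is the
next thing to try, with host, channel, id and lun adapted to your bus.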
In fact, I was impressed how robust a linux system can be. Bacula
itself also recovers from many hardware problems, but the timeouts in
the drivers can be really long.
Of course, you will find that most problems come from damaged tapes,
misconfiguration, and operator work in conjunction with funny hardware
like tape drives that don't like being talked to while they load a tape.
Arno
What I ended up doing is deleting the VolumeName mp0006 from the db and
relabeling it; now it is working. Looks as if I am going to have to do
that to all the tapes :(
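
For repeating that on several tapes, the console commands could be
generated with something like the sketch below. It only prints the
commands and sends nothing to bacula. The helper name is made up, the
pool name Default is an assumption, and the console will normally ask
for confirmation before deleting a volume, so the lines are meant to be
typed or pasted by hand.

```shell
#!/bin/sh
# Print the console commands to drop a volume from the catalog and
# label the loaded tape again under the same name.
relabel_cmds() {
    vol=$1
    storage=$2
    pool=$3
    printf 'delete volume=%s\n' "$vol"
    printf 'label storage=%s volume=%s pool=%s\n' "$storage" "$vol" "$pool"
}

# Volume and storage names as in this thread; the pool is a guess.
relabel_cmds mp0006 DLT-7000 Default
```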
--
Tom Morgan
[EMAIL PROTECTED]