Hello,
On 1/17/2006 12:10 AM, Alexander Bergolth wrote:
On 01/14/06 15:20, Julien Cigar wrote:
I had exactly the same problem some months ago, full thread can be
found here:
http://www.nabble.com/Still-problems-with-my-Sony-tapes-t504270.html
Thanks for the hint. But I don't think that this is the same problem.
Your errors were DDS related, I'm using a DLT drive. Besides, the
scsi-errors are pretty different and your problems seem to have been
scsi-id related. I don't suspect this to be the problem in my case, as
the server was shipped with this configuration and the tape drive has
been working without any problem now for nearly a year.
I believe my problem might have been caused by a drive firmware problem
or something like that. After that error, the tape drive was totally
unresponsive. When I tried to query it using mt status or tapeinfo, the
following errors showed up in the syslog:
scsi: reservation conflict: host 0 channel 1 id 6 lun 0
st0: Error 70018 (sugg. bt 0x0, driver bt 0x0, host bt 0x7).
After a cold restart, the drive works again now.
Then let's hope it remains that way... Unfortunately, I have experienced
similar problems, though not often, and I could never find any definite
reason for them.
I suspect that there may be slight instabilities in the SCSI HBAs I use,
but nothing I could prove. Also, I use rather aged hardware here,
so it's quite possibly related to hardware errors, too.
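For reference, the hex word in the "st0: Error ..." lines can be split into its bytes. This is only a sketch: it assumes the usual Linux layout of the SCSI result word (status in the lowest byte, then message, host and driver bytes), and the name table is my reading of the kernel headers, so check it against your kernel version. The function name decode_st_error is made up for this example.

```python
# Host byte names as I understand them from the Linux SCSI headers
# (assumption: verify against the headers of the kernel you run).
HOST_BYTE_NAMES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x06: "DID_PARITY",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
}


def decode_st_error(hex_word):
    """Split the hex word from an 'st0: Error <word>' line into its bytes."""
    value = int(hex_word, 16)
    host = (value >> 16) & 0xFF
    return {
        "status": value & 0xFF,          # raw SCSI status byte
        "msg": (value >> 8) & 0xFF,      # message byte
        "host": host,                    # host adapter (HBA) byte
        "driver": (value >> 24) & 0xFF,  # driver byte
        "host_name": HOST_BYTE_NAMES.get(host, "unknown"),
    }


if __name__ == "__main__":
    # The two error words that appear in this thread.
    for word in ("70018", "400f4"):
        print(word, decode_st_error(word))
```

The host byte this yields agrees with the "host bt" value the st driver prints on the same line, which is a quick sanity check for the assumed layout.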
However the other issue is that apparently bacula has been informed
about the error, but it seems to have interpreted it as an end of medium
condition instead of aborting the backup:
Yup, that's normal because of two reasons:
- First, logically finishing the volume after an error is Bacula's only
choice, and
- second, there is, to my knowledge, no portable, reliable way to
distinguish between EOM and real hardware errors from an application.
-------------------- snipp! --------------------
14-Jan 01:57 samba-sd: Samba-Homes.2006-01-14_01.05.00 Error:
block.c:538 Write error at 6:3819 on device "DLT1" (/dev/nst0).
ERR=Input/output error.
14-Jan 01:57 samba-sd: Samba-Homes.2006-01-14_01.05.00 Error: Error
writing final EOF to tape. This Volume may not be readable.
dev.c:1553 ioctl MTWEOF error on "DLT1" (/dev/nst0). ERR=Input/output
error.
14-Jan 01:57 samba-sd: End of medium on Volume "Weekly-2005-06-12_8"
Bytes=6,245,922,599 Blocks=96,818 at 14-Jan-2006 01:57.
-------------------- snipp! --------------------
Is there an option to let bacula abort a backup after a write error?
Not as far as I know, because then it might choke on any EOM and on any
temporary failure.
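What you can do is check the job log from outside Bacula. The following is a minimal sketch, not a Bacula feature: it scans log text for a write error that is followed by an "End of medium" message and reports the affected volume. The patterns are taken from the log excerpt in this thread and may need adjusting for other versions; the function name suspect_volumes is made up for this example.

```python
import re

# Patterns copied from the log excerpt in this thread (assumption: other
# Bacula versions may word these messages differently).
WRITE_ERROR = re.compile(r'Write error at \S+ on device "([^"]+)"')
END_OF_MEDIUM = re.compile(r'End of medium on Volume "([^"]+)"')


def suspect_volumes(log_text):
    """Return volumes whose 'End of medium' was preceded by a write error."""
    suspects = []
    pending_error = False
    for line in log_text.splitlines():
        if WRITE_ERROR.search(line):
            pending_error = True
            continue
        match = END_OF_MEDIUM.search(line)
        if match:
            if pending_error:
                suspects.append(match.group(1))
            # A volume change resets the state for the next volume.
            pending_error = False
    return suspects


SAMPLE = """\
14-Jan 01:57 samba-sd: Samba-Homes.2006-01-14_01.05.00 Error:
block.c:538 Write error at 6:3819 on device "DLT1" (/dev/nst0).
ERR=Input/output error.
14-Jan 01:57 samba-sd: End of medium on Volume "Weekly-2005-06-12_8"
Bytes=6,245,922,599 Blocks=96,818 at 14-Jan-2006 01:57.
"""

if __name__ == "__main__":
    print(suspect_volumes(SAMPLE))
```

A volume flagged this way was closed after an I/O error rather than at a genuine end of tape, so it is worth verifying before you rely on it for a restore.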
Cheers,
--leo
P.S.: I'd also still be interested in other users' experiences with the
durability of DLT tapes and some recommendations on drive cleaning and
maintenance.
I must have missed that the last time...
Well, here I use a really aged 7-Slot DLT4000 autoloader. I'm not sure
how old it actually is, but for those of you who followed the
development: It's an ADIC FastStor 4000 which I bought second hand.
Might be about eight years old. When I installed it, according to the
internal counters, it had not been used very often. Currently, it
claims to have loaded the drive fewer than 2200 times. It usually works
without problems.
I have also tried used DLT tapes, and usually found them to be reliable
in day-to-day use. These cartridges had usually been used only once or
twice, though, and then stored in good conditions. Anyway, with my usage
profile, I can't find any significant difference in reliability between
new and used DLT-IV tapes written in DLT4000 format. I'd estimate the
percentage of problematic or even damaged tapes to be below 5%. (I test
used tapes by writing some GB of random data to them. I consider a tape
bad when any of the following happens: a write failure during the test
write; problems labeling a tape that don't disappear after manually
writing some random data, testing with btape, and re-labeling it; or a
reduced capacity during operation.)
Currently, in my Bacula catalog, I have 64 DLT-IV tapes of which 2 are
damaged: One was a used tape I never could write any data to, and one
was removed just some time ago because, after recycling, it couldn't
be recognized any more. I have not yet tested whether that tape is still
writeable because I wouldn't trust it any more anyway.
Concerning drive maintenance: In my home-office environment (i.e.
neither clean-room conditions nor cigarette smoke or unusual amounts of
dust), I have to clean the DLT drive every few months, or after about
15-25 tapes written. (Usually, I follow the drive's indication that it
needs to be cleaned.) That's about what I know as normal DLT cleaning
cycles from other sites, too.
In a dusty environment, especially with cigarette smoke, you will need
to clean more often, and you will have to replace the drive much more often.
In short: I know DLT as a very robust technology, both concerning drives
and tapes.
Arno
Alexander Bergolth wrote:
Hi!
Tonight, I got the following I/O error for the first time:
The drive is a DLT1 tape drive:
Vendor: BNCHMARK Model: DLT1 Rev: 5538
Type: Sequential-Access ANSI SCSI revision: 02
-------------------- snipp! --------------------
14-Jan 01:55 samba-sd: Writing spooled data to Volume. Despooling
536,904,185 bytes ...
14-Jan 01:57 samba-sd: Samba-Homes.2006-01-14_01.05.00 Error:
block.c:538 Write error at 6:3819 on device "DLT1" (/dev/nst0).
ERR=Input/output error.
14-Jan 01:57 samba-sd: Samba-Homes.2006-01-14_01.05.00 Error: Error
writing final EOF to tape. This Volume may not be readable.
dev.c:1553 ioctl MTWEOF error on "DLT1" (/dev/nst0). ERR=Input/output
error.
14-Jan 01:57 samba-sd: End of medium on Volume "Weekly-2005-06-12_8"
Bytes=6,245,922,599 Blocks=96,818 at 14-Jan-2006 01:57.
14-Jan 01:57 samba-dir: Recycled volume "Weekly-2005-04-22_2"
14-Jan 01:57 samba-sd: Please mount Volume "Weekly-2005-04-22_2" on
Storage Device "DLT1" (/dev/nst0) for Job
Samba-Homes.2006-01-14_01.05.00
-------------------- snipp! --------------------
This is the dmesg output:
-------------------- snipp! --------------------
st0: Error with sense data: <6>st0: Current: sense key: Not Ready
Additional sense: Logical unit not ready, initializing cmd. required
st0: Error 400f4 (sugg. bt 0x0, driver bt 0x0, host bt 0x4).
st0: Error 400f4 (sugg. bt 0x0, driver bt 0x0, host bt 0x4).
[...]
-------------------- snipp! --------------------
As I don't have very much experience with DLT tapes:
Does this indicate a worn out tape or should I simply clean the drive
with the cleaning tape?
How many backups should be possible with a DLT tape / at which
intervals should a tape be changed?
At which intervals should the drive be cleaned?
In this case, the current job is still running and waiting for
another tape. Is there an option to let bacula abort the job, if such
an error occurs?
Cheers,
--leo
--
IT-Service Lehmann [EMAIL PROTECTED]
Arno Lehmann http://www.its-lehmann.de