Hello,

If you are running Bacula 9.0.0 or later, please be sure that you have implemented the new tape alert code.  It is documented in the New Features chapter of the manual and is in the bacula-sd.conf file that is distributed with the source code.  This new code automatically checks for tape alerts (by calling the tapealert script provided by Bacula).  You can then check on the alerts with the bconsole status command.

Tape alerts can often give very precise information on current or even pending tape problems.
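
As a rough sketch of what this looks like in bacula-sd.conf, the Device resource gains two directives, Alert Command and Control Device, as described in the New Features chapter. The script path and the /dev/sg1 control device below are assumptions, not values taken from this thread; substitute your own install path and the SCSI generic device that corresponds to your drive (lsscsi -g should show the mapping).

```
Device {
  Name = LTO-8
  Media Type = LTO-8
  Archive Device = /dev/nst0
  # Assumed install path; point this at the tapealert script shipped with Bacula.
  Alert Command = "/opt/bacula/scripts/tapealert %l"
  # Assumed device node; it must be the SCSI control device for /dev/nst0.
  Control Device = /dev/sg1
  # Remaining directives unchanged.
}
```

After the Storage daemon is restarted with both directives in place, any alerts the drive raises should show up in the bconsole status output for that storage.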

Best regards,
Kern

On 09/21/2018 01:23 PM, Martin Simmons wrote:
This looks like a problem reported by the drive or the tape.  I suggest
setting the "Alert Command" option in bacula-sd.conf to run something like

/usr/sbin/smartctl -a /dev/nst0 -T verypermissive

(or use the tapeinfo program) to check for TapeAlert messages after every
backup.
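
In the Device resource this might look like the fragment below. It is only a sketch that wraps the command above in the directive; the device name and path are the ones already quoted in this thread, and how the command's output is reported may differ between Bacula versions.

```
Device {
  Name = LTO-8
  Archive Device = /dev/nst0
  # Run after each job to surface TapeAlert messages from the drive.
  Alert Command = "/usr/sbin/smartctl -a /dev/nst0 -T verypermissive"
  # Remaining directives unchanged.
}
```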

__Martin


On Mon, 17 Sep 2018 17:37:12 +0100, Kevin Hodges said:
Martin

    found the following around the same time:


Sep  9 10:54:58 swlx1 kernel: st 1:0:0:0: [st0] Sense Key : Medium Error [deferred]
Sep  9 10:54:58 swlx1 kernel: st 1:0:0:0: [st0] Add. Sense: Write append error
Sep  9 10:54:59 swlx1 kernel: st 1:0:0:0: [st0] Sense Key : Medium Error [current]
Sep  9 10:54:59 swlx1 kernel: st 1:0:0:0: [st0] Add. Sense: Write append error
Sep  9 10:55:00 swlx1 kernel: st 1:0:0:0: [st0] Sense Key : Medium Error [current]
Sep  9 10:55:00 swlx1 kernel: st 1:0:0:0: [st0] Add. Sense: Write append error
Sep  9 10:55:00 swlx1 kernel: st 1:0:0:0: [st0] Sense Key : Medium Error [current]
Sep  9 10:55:00 swlx1 kernel: st 1:0:0:0: [st0] Add. Sense: Write append error
Sep  9 10:55:01 swlx1 kernel: st 1:0:0:0: [st0] Sense Key : Medium Error [current]
Sep  9 10:55:01 swlx1 kernel: st 1:0:0:0: [st0] Add. Sense: Write append error

Kevin

On Mon, 2018-09-17 at 17:28 +0100, Martin Simmons wrote:
The "ERR=Input/output error" can be caused by hardware problems, but I
would not expect it from a network problem.  If you have the syslog
(e.g. /var/log/messages) from that time, I would check for errors there
too.

__Martin


On Mon, 17 Sep 2018 14:30:16 +0100, Kevin Hodges said:
hi Martin

    found this in the log:

Writing spooled data to Volume. Despooling 50,000,033,271 bytes ...
09-Sep 10:55 swlx1.rdg.ac.uk-sd2 JobId 36: Error: block.c:255 Write error at 1512:13125 on device "LTO-8" (/dev/nst0). ERR=Input/output error.
09-Sep 10:55 swlx1.rdg.ac.uk-sd2 JobId 36: Error: Error writing final EOF to tape. This Volume may not be readable.
tape_dev.c:941 ioctl MTWEOF error on "LTO-8" (/dev/nst0). ERR=Input/output error.

I've restarted the backup from scratch and so far it seems to have got
past the point where it failed last time, so fingers crossed. There
were some network issues around the time of the failure!

Regards

Kevin

On Mon, 2018-09-17 at 14:13 +0100, Martin Simmons wrote:
When the director stopped at ~1.5TB, did it report any other messages
(e.g. I/O errors)?

I suggest looking in the system logs / console for messages around that
time as well.

__Martin


On Tue, 11 Sep 2018 10:30:31 +0100, Kevin Hodges said:
hi

    I came across a problem recently after installing a new single tape
drive for backups. This is an HPE LTO-8 Ultrium machine connected to a
Red Hat Linux box: Linux swlx1.rdg.ac.uk 3.10.0-862.9.1.el7.x86_64

The problem occurred whilst performing a backup that consists of
several million files, several TB in total size. The backup stopped
after writing ~1.5TB, with the director reporting the volume was full
and asking for a new labelled volume. LTO-8 should take at least 12TB
(native). This was a surprise, but I thought it might be a tape problem,
so I unmounted the tape and tried to load a new tape to label and mount
it and continue, but I could not load the new blank tape. It seemed
like the machine continually tried to load the tape without success,
and I had to keep pressing the eject button to extract the tape.

Thinking this might be a hardware problem, I stopped the backup, shut
down the bacula daemons and ran all the vendor tests, which reported no
errors. On restarting the bacula daemons I found I was able to load the
tapes again and restart the backup.

So my question is: if this is not a hardware or tape problem, what
prevents me from loading and labelling a new tape during an ongoing
backup job? Is there some way to pause the backup to allow a new tape
to be labelled?

My storage config is:

Device {
   Name = LTO-8
   Media Type = LTO-8
   Archive Device = /dev/nst0
   AutomaticMount = yes;
   AlwaysOpen = yes;
   RemovableMedia = yes;
   RandomAccess = no;
   AutoChanger = no
   Spool Directory = /opt/bacula/working2
   Maximum Spool Size = 100GB
   Maximum Job Spool Size  = 50GB
}

Should the AutomaticMount be set to 'no' to stop attempts to
automatically mount any new tape even if it is not labelled?

The issue of the tape being labelled full well before its capacity is
still a mystery.

Thanks for any help

Kevin

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


