Hello,

Thanks for being a long-term Bacula user.  I agree with Andrew: it looks like it may be a media problem rather than a drive problem.  When reviewing Bacula output, I like to see the full job log, including the Bacula sign-on line, to be sure which version it is.

In your case, particularly if this is a stand-alone drive, I would *strongly* recommend upgrading to Bacula version 9.0.5, which was just released, because it has some important fixes that correct Bacula behavior with stand-alone drives (as opposed to autochangers).

Also, if you continue to have errors after trying a different cassette, it would be useful to see your Storage Daemon Device resource to see what you are calling to get the tape alert messages.  The code in that area changed rather significantly between older Bacula versions and version 9.0.x, and you really need the new Device configuration (bacula-sd.conf) to get correct alerts.  From what I see, it looks like you have it set up correctly, but it is worth verifying.
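
When comparing runs with different cassettes, it can help to pull just the alert numbers out of each job log. The following is only a minimal Python sketch and not part of Bacula: the function name `extract_alerts` is made up for illustration, and the pattern assumes the alert line format that appears in the job log quoted below (messages shortened here).

```python
import re

# Assumed record layout, copied from the alert lines quoted in this thread:
#   <severity>: Alert: Volume="<name>" alert=<number>: ERR=<message>
ALERT_RE = re.compile(
    r'(Fatal error|Warning): Alert: Volume="([^"]+)" alert=(\d+): ERR='
)


def extract_alerts(log_text):
    """Return (severity, volume, alert_number) tuples in order of appearance.

    Hypothetical helper: works even when several log records have been
    run together on one line, as in the pasted output.
    """
    return [(sev, vol, int(num)) for sev, vol, num in ALERT_RE.findall(log_text)]


# Shortened excerpt of the backup log from this thread.
sample = (
    '30-Oct 13:16 local-sd JobId 28: Fatal error: Alert: Volume="TEST3" alert=3: '
    'ERR=The operation has stopped because an error has occurred while reading '
    'or writing data which the drive cannot correct. '
    '30-Oct 13:16 local-sd JobId 28: Fatal error: Alert: Volume="TEST3" alert=5: '
    'ERR=The tape is damaged or the drive is faulty. '
    '30-Oct 13:16 local-sd JobId 28: Warning: Disabled Device "LTO-7" (/dev/nst0) '
    'due to tape alert=39. '
    'local-sd JobId 28: Warning: Alert: Volume="TEST3" alert=39: '
    'ERR=The tape drive may have a fault.'
)

for severity, volume, number in extract_alerts(sample):
    print(f"{volume}: alert {number} ({severity})")
```

The "Disabled Device" warning is deliberately not matched, since it only repeats an alert number that is already reported on its own line. If the same alert numbers show up with a different cassette, that points away from the media.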

Best regards,

Kern


On 11/04/2017 11:07 PM, Jaime Ferrer wrote:

Hi;

 

I have been working with Bacula for almost 10 years. It’s a great program, and I’ve installed it on several servers without issues. It has also saved me several times! 😉 Recently I’ve been preparing a Bacula server with the latest version, 9.0.4, and an HP Ultrium 7 tape drive. After successfully running the btape test, I performed a first backup and restore to test the unit. The first backup fails with the following error:

 

JobId 28: Elapsed time=00:04:54, Transfer rate=20.13 M Bytes/second

30-Oct 13:16 local-sd JobId 28: Fatal error: Alert: Volume="TEST3" alert=3: ERR=The operation has stopped because an error has occurred while reading or writing data which the drive cannot correct. The drive had a hard read or write error

30-Oct 13:16 local-sd JobId 28: Fatal error: Alert: Volume="TEST3" alert=5: ERR=The tape is damaged or the drive is faulty. Call the tape drive supplier helpline.  The drive can no longer read data from the tape

30-Oct 13:16 local-sd JobId 28: Warning: Disabled Device "LTO-7" (/dev/nst0) due to tape alert=39.

local-sd JobId 28: Warning: Alert: Volume="TEST3" alert=39: ERR=The tape drive may have a fault. Check for availability of diagnostic information and run extended diagnostics if applicable.   The drive may have had a failure which may be identified by stored diagnostic information or by running extended diagnostics (eg Send Diagnostic). Check the tape drive users manual for instructions on running extended diagnostic tests and retrieving diagnostic data.

FileSet:                "Full Set" 2017-09-29 19:01:12

  Pool:                   "TEST" (From Job resource)

  Catalog:                "MyCatalog" (From Client resource)

  Storage:                "LTO-7" (From Pool resource)

  Scheduled time:         30-Oct-2017 13:10:55

  Start time:             30-Oct-2017 13:11:03

  End time:               30-Oct-2017 13:16:12

  Elapsed time:           5 mins 9 secs

  Priority:               11

  FD Files Written:       190,214

  SD Files Written:       190,214

  FD Bytes Written:       5,891,151,955 (5.891 GB)

  SD Bytes Written:       5,919,950,296 (5.919 GB)

  Rate:                   19065.2 KB/s

  Software Compression:   None

  Comm Line Compression:  None

  Snapshot/VSS:           no

  Encryption:             no

  Accurate:               no

  Volume name(s):         TEST3

  Volume Session Id:      1

  Volume Session Time:    1509379838

  Last Volume Bytes:      17,790,732,288 (17.79 GB)

  Non-fatal FD errors:    0

  SD Errors:              1

  FD termination status:  OK

  SD termination status:  Error

  Termination:            *** Backup Error ***

 

 

As the error suggests, I ran the HP LTT tools, but as far as I got, all drive diagnostics seem to be OK.

 

Looking into the failed backup job, it seems that the job fails just after the files are backed up, since all test jobs have the same size, around 5 GB, which is the size of the test files. I can also restore the data, but the restore finishes with the same error. The files are not stored in the catalog since the backup job fails, so I had to restore all files at once.

 

30-Oct 13:20 local-dir JobId 29: Using Device "LTO-7" to read.

30-Oct 13:20  local-sd JobId 29: Ready to read from volume "TEST3" on Tape device "LTO-7" (/dev/nst0).

30-Oct 13:20  local-sd JobId 29: Forward spacing Volume "TEST3" to addr=2:0

30-Oct 13:22  local-sd JobId 29: Elapsed time=00:01:37, Transfer rate=61.03 M Bytes/second

30-Oct 13:22  local-sd JobId 29: Fatal error: Alert: Volume="TEST3" alert=3: ERR=The operation has stopped because an error has occurred while reading or writing data which the drive cannot correct. The drive had a hard read or write error

30-Oct 13:22  .local-sd JobId 29: Fatal error: Alert: Volume="TEST3" alert=5: ERR=The tape is damaged or the drive is faulty. Call the tape drive supplier helpline.  The drive can no longer read data from the tape

30-Oct 13: .local-sd JobId 29: Warning: Disabled Device "LTO-7" (/dev/nst0) due to tape alert=39.

30-Oct 13:22  .local-sd JobId 29: Warning: Alert: Volume="TEST3" alert=39: ERR=The tape drive may have a fault. Check for availability of diagnostic information and run extended diagnostics if applicable.   The drive may have had a failure which may be identified by stored diagnostic information or by running extended diagnostics (eg Send Diagnostic). Check the tape drive users manual for instructions on running extended diagnostic tests and retrieving diagnostic data.

30-Oct 13:22 APSSCL0SRV010.apsa.local-dir JobId 29: Error: Bacula  .local-dir 9.0.4 (06Sep17):

  Build OS:               x86_64-pc-linux-gnu redhat (Core)

  JobId:                  29

  Job:                    RestoreFiles.2017-10-30_13.19.42_04

  Restore Client:          .local-fd

  Start time:             30-Oct-2017 13:19:44

  End time:               30-Oct-2017 13:22:23

  Files Expected:         190,214

  Files Restored:         190,214

  Bytes Restored:         5,891,133,870

  Rate:                   37051.2 KB/s

  FD Errors:              0

  FD termination status:  OK

  SD termination status:  Error

  Termination:            *** Restore Error ***

 

With an HP Ultrium 6 unit, Bacula works flawlessly.

 

So I’m starting to doubt this unit, even though its diagnostics seem to be OK. I will upgrade the firmware and the driver from the HPE site and test again. I will also run a round of tests with tar.

 

But meanwhile, I’m wondering if any of you have experienced something like this. Is LTO-7 compatible with Bacula? Are there any special parameters or configuration settings for these units (LTO-7)?

 

Thanks in advance.

 

 

 



------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot


_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
