This is somewhat of a wild shot, but I recently saw a kernel bug fix against 
the aic7xxx drivers.  If I remember right (I'm a bit hazy on this) there was 
a problem after writing 35GB of data.

On Thursday 06 October 2005 02:12, Joshua Kugler wrote:
> And yet more info (long, with lots of job reports).
>
> This error actually happened with *two* different tapes, and in both cases,
> when they errored, they had the same "last volume bytes" value.
>
> This seems to be tied to network errors somehow.  Observer:
>
> The job before this one loaded tape 0009.  It writes to tape nine, and has
> a last volume bytes value of 15,747,641,980.
>
> The next job runs, but errors out due to a network glitch.
>
> 02-Oct 01:05 herodotus-dir: Start Backup JobId 6870,
> Job=fiscalpro.2005-10-02_01.05.20
> 02-Oct 01:05 herodotus-sd: Spooling data ...
> 02-Oct 03:02 herodotus-sd: Writing spooled data to Volume. Despooling
> 71,612,997,031 bytes ...
> 02-Oct 05:16 herodotus-dir: fiscalpro.2005-10-02_01.05.20 Fatal error:
> Network error with FD during Backup: ERR=Connection timed out
> 02-Oct 05:17 herodotus-dir: fiscalpro.2005-10-02_01.05.20 Fatal error: No
> Job status returned from FD.
> 02-Oct 05:27 herodotus-dir: fiscalpro.2005-10-02_01.05.20 Error: Bacula
> 1.36.1 (26Nov04): 02-Oct-2005 05:17:09
>   JobId:                  6870
>   Job:                    fiscalpro.2005-10-02_01.05.20
>   Backup Level:           Full
>   Client:                 fiscalpro-fd
>   FileSet:                "wxpDefault" 2005-01-10 15:27:47
>   Pool:                   "Default"
>   Storage:                "herodotus-sd"
>   Start time:             02-Oct-2005 01:05:21
>   End time:               02-Oct-2005 05:17:09
>   FD Files Written:       0
>   SD Files Written:       0
>   FD Bytes Written:       0
>   SD Bytes Written:       0
>   Rate:                   0.0 KB/s
>   Software Compression:   None
>   Volume name(s):
>   Volume Session Id:      674
>   Volume Session Time:    1126137632
>   Last Volume Bytes:      51,747,561,167
>   Non-fatal FD errors:    0
>   SD Errors:              0
>   FD termination status:  Error
>   SD termination status:  Running
>   Termination:            *** Backup Error ***
>
> Notice the last volume bytes value.
>
> Then, the next job (also errors, but with the EOM error message):
>
> 02-Oct 01:05 herodotus-dir: Start Backup JobId 6874,
> Job=webdev1.2005-10-02_01.05.24
> 02-Oct 01:05 herodotus-sd: Spooling data ...
> 02-Oct 03:02 herodotus-sd: Writing spooled data to Volume. Despooling
> 71,612,545,363 bytes ...
> 02-Oct 04:48 herodotus-sd: Writing spooled data to Volume. Despooling
> 71,613,118,263 bytes ...
> 02-Oct 06:15 herodotus-sd: webdev1.2005-10-02_01.05.24 Fatal error: Cannot
> write block. Device at EOM.
> 02-Oct 06:15 herodotus-sd: webdev1.2005-10-02_01.05.24 Fatal error: Fatal
> despooling error.02-Oct 06:15 herodotus-sd: webdev1.2005-10-02_01.05.24
> Fatal error: Fatal device error: ERR=
> 02-Oct 06:15 webdev1-fd: webdev1.2005-10-02_01.05.24 Fatal error:
> backup.c:472 Network send error 32768 to SD. ERR=Broken pipe
> 02-Oct 06:15 herodotus-dir: webdev1.2005-10-02_01.05.24 Error: Bacula
> 1.36.1 (26Nov04): 02-Oct-2005 06:15:51
>   JobId:                  6874
>   Job:                    webdev1.2005-10-02_01.05.24
>   Backup Level:           Full
>   Client:                 webdev1-fd
>   FileSet:                "webdev1-fileset" 2005-03-11 11:28:01
>   Pool:                   "Default"
>   Storage:                "herodotus-sd"
>   Start time:             02-Oct-2005 01:05:25
>   End time:               02-Oct-2005 06:15:51
>   FD Files Written:       3,424
>   SD Files Written:       3,424
>   FD Bytes Written:       37,355,359,925
>   SD Bytes Written:       37,355,770,896
>   Rate:                   2005.5 KB/s
>   Software Compression:   None
>   Volume name(s):         BaculaTape-0009-20041213
>   Volume Session Id:      678
>   Volume Session Time:    1126137632
>   Last Volume Bytes:      12,999,102,895
>   Non-fatal FD errors:    0
>   SD Errors:              0
>   FD termination status:  Error
>   SD termination status:  Error
>   Termination:            *** Backup Error ***
>
>
> Fatal error, and Last Volume Bytes (LVB) less than before.
>
> Next job:
>
> 02-Oct 01:05 herodotus-dir: Start Backup JobId 6867,
> Job=textbook3.2005-10-02_01.05.17
> 02-Oct 01:05 herodotus-sd: Spooling data ...
> 02-Oct 03:02 herodotus-sd: Writing spooled data to Volume. Despooling
> 71,612,803,459 bytes ...
> 02-Oct 06:15 herodotus-sd: textbook3.2005-10-02_01.05.17 Fatal error:
> Cannot write block. Device at EOM.
> 02-Oct 06:15 herodotus-sd: textbook3.2005-10-02_01.05.17 Fatal error: Fatal
> despooling error.02-Oct 06:15 herodotus-sd: textbook3.2005-10-02_01.05.17
> Fatal error: Fatal device error: ERR=
> 02-Oct 06:15 textbook3-fd: textbook3.2005-10-02_01.05.17 Fatal error: ..
> \filed\../../filed/backup.c:472 Network send error 32768 to SD. ERR=An
> existing connection was forcibly closed by the remote host.
>
> And same LVB.
>
> Another job with EOM error, and same LVB.  Then a job that succeeds, and
> writes, but same LVB.
>
> Then I get a job that changes tape (0009 to 0010), but then fails with the
> EOM error, and ends up with the same LVB value.
>
> 02-Oct 01:05 herodotus-dir: Start Backup JobId 6868,
> Job=textbook4.2005-10-02_01.05.18
> 02-Oct 01:05 herodotus-sd: Spooling data ...
> 02-Oct 03:02 herodotus-sd: Writing spooled data to Volume. Despooling
> 71,612,674,411 bytes ...
> 02-Oct 05:13 herodotus-sd: End of Volume "BaculaTape-0009-20041213" at
> 47:819 on device /dev/nst0. Write of 64512 bytes got -1.
> 02-Oct 05:14 herodotus-sd: Re-read of last block succeeded.
> 02-Oct 05:14 herodotus-sd: End of medium on Volume
> "BaculaTape-0009-20041213" Bytes=47,049,606,471 Blocks=729,319 at
> 02-Oct-2005 05:14.
> 02-Oct 05:14 herodotus-dir: Purging oldest volume
> "BaculaTape-0010-20041213" 02-Oct 05:27 herodotus-dir: 215 Files on Volume
> "BaculaTape-0010-20041213" purged from catalog.
> 02-Oct 05:27 herodotus-dir: There are no more Jobs associated with Volume
> "BaculaTape-0010-20041213". Marking it purged.
> 02-Oct 05:27 herodotus-sd: 3301 Issuing autochanger "loaded drive 0"
> command. 02-Oct 05:27 herodotus-sd: 3302 Autochanger "loaded drive 0",
> result is Slot 9.
> 02-Oct 05:27 herodotus-sd: 3303 Issuing autochanger "unload slot 9, drive
> 0" command.
> 02-Oct 05:27 herodotus-sd: 3304 Issuing autochanger "load slot 10, drive 0"
> command.
> 02-Oct 05:30 herodotus-sd: 3305 Autochanger "load slot 10, drive 0", status
> is OK.
> 02-Oct 05:30 herodotus-sd: Recycled volume "BaculaTape-0010-20041213" on
> device "/dev/nst0", all previous data lost.
> 02-Oct 05:30 herodotus-sd: New volume "BaculaTape-0010-20041213" mounted on
> device /dev/nst0 at 02-Oct-2005 05:30.
> 02-Oct 05:36 textbook4-fd:      Could not stat c:/Documents and
> Settings/Michael Lovecraft/My Documents/My Music/iTunes/iTunes Music/Ozzy
> Osbourne/Prince Of Darkness [Disc 2]: ERR=The system cannot find the file
> specified.
>
> 02-Oct 05:44 herodotus-sd: Committing spooled data to Volume. Despooling
> 70,261,901,515 bytes ...
> 02-Oct 06:16 herodotus-sd: textbook4.2005-10-02_01.05.18 Fatal error:
> Cannot write block. Device at EOM.
> 02-Oct 06:16 herodotus-sd: Sending spooled attrs to the Director.
> Despooling 1,590,123 bytes ...
> 02-Oct 06:17 textbook4-fd: textbook4.2005-10-02_01.05.18 Fatal error: Bad
> status 102 returned from Storage Daemon.
> 02-Oct 06:17 herodotus-dir: textbook4.2005-10-02_01.05.18 Error: Bacula
> 1.36.1 (26Nov04): 02-Oct-2005 06:17:24
>   JobId:                  6868
>   Job:                    textbook4.2005-10-02_01.05.18
>   Backup Level:           Full
>   Client:                 textbook4-fd
>   FileSet:                "wxpDefault" 2005-01-10 15:27:47
>   Pool:                   "Default"
>   Storage:                "herodotus-sd"
>   Start time:             02-Oct-2005 01:05:19
>   End time:               02-Oct-2005 06:17:24
>   FD Files Written:       4,023
>   SD Files Written:       4,023
>   FD Bytes Written:       8,942,960,505
>   SD Bytes Written:       8,943,908,548
>   Rate:                   477.6 KB/s
>   Software Compression:   None
>   Volume name(s):         BaculaTape-0010-20041213|BaculaTape-0009-20041213
>   Volume Session Id:      672
>   Volume Session Time:    1126137632
>   Last Volume Bytes:      12,999,102,895
>   Non-fatal FD errors:    1
>   SD Errors:              0
>   FD termination status:  Error
>   SD termination status:  Error
>   Termination:            *** Backup Error ***
>
> Then I have a few more jobs that succeed, but the LVB value stays the same,
>
> Then a few more jobs that fail, interspersed by success (LVB always the
> same, mind you).
>
> Then, today, I ejected 0010, put in 0011, ran a backup.  The server got
> 0010 again, ran the backup, and succeeded.  And LVB incremented too.
>
> Any ideas, tips, suggestions, etc?
>
> j----- k-----

-- 
Best regards,

Kern

  (">
  /\
  V_V


-------------------------------------------------------
This SF.Net email is sponsored by:
Power Architecture Resource Center: Free content, downloads, discussions,
and more. http://solutions.newsforge.com/ibmarch.tmpl
_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users

Reply via email to