On Wednesday 05 October 2005 23:18, Kern Sibbald wrote: > This is somewhat of a wild shot, but I recently saw a kernel bug fix > against the aic7xxx drivers. If I remember right (I'm a bit hazy on this) > there was a problem after writing 35GB of data.
According to lspcidrake (Mandrake's lspci extension): aic7xxx : Adaptec|7899P [STORAGE_SCSI] (vendor:9005 device:00cf subv:1028 subd:00d9) So, I'll see about upgrading the kernel. Thanks for the tip. Arno also suspected a hardware issue, so I'll see what I can track down. What *is* odd is that after I ejected the tape, and let Bacula reload the tape, all worked fine, and all my backups ran just fine last night...so... j----- k----- > > On Thursday 06 October 2005 02:12, Joshua Kugler wrote: > > And yet more info (long, with lots of job reports). > > > > This error actually happened with *two* different tapes, and in both > > cases, when they errored, they had the same "last volume bytes" value. > > > > This seems to be tied to network errors somehow. Observer: > > > > The job before this one loaded tape 0009. It writes to tape nine, and > > has a last volume bytes value of 15,747,641,980. > > > > The next job runs, but errors out due to a network glitch. > > > > 02-Oct 01:05 herodotus-dir: Start Backup JobId 6870, > > Job=fiscalpro.2005-10-02_01.05.20 > > 02-Oct 01:05 herodotus-sd: Spooling data ... > > 02-Oct 03:02 herodotus-sd: Writing spooled data to Volume. Despooling > > 71,612,997,031 bytes ... > > 02-Oct 05:16 herodotus-dir: fiscalpro.2005-10-02_01.05.20 Fatal error: > > Network error with FD during Backup: ERR=Connection timed out > > 02-Oct 05:17 herodotus-dir: fiscalpro.2005-10-02_01.05.20 Fatal error: No > > Job status returned from FD. > > 02-Oct 05:27 herodotus-dir: fiscalpro.2005-10-02_01.05.20 Error: Bacula > > 1.36.1 (26Nov04): 02-Oct-2005 05:17:09 > > JobId: 6870 > > Job: fiscalpro.2005-10-02_01.05.20 > > Backup Level: Full > > Client: fiscalpro-fd > > FileSet: "wxpDefault" 2005-01-10 15:27:47 > > Pool: "Default" > > Storage: "herodotus-sd" > > Start time: 02-Oct-2005 01:05:21 > > End time: 02-Oct-2005 05:17:09 > > FD Files Written: 0 > > SD Files Written: 0 > > FD Bytes Written: 0 > > SD Bytes Written: 0 > > Rate: 0.0 KB/s > > Software Compression: None > > Volume name(s): > > Volume Session Id: 674 > > Volume Session Time: 1126137632 > > Last Volume Bytes: 51,747,561,167 > > Non-fatal FD errors: 0 > > SD Errors: 0 > > FD termination status: Error > > SD termination status: Running > > Termination: *** Backup Error *** > > > > Notice the last volume bytes value. > > > > Then, the next job (also errors, but with the EOM error message): > > > > 02-Oct 01:05 herodotus-dir: Start Backup JobId 6874, > > Job=webdev1.2005-10-02_01.05.24 > > 02-Oct 01:05 herodotus-sd: Spooling data ... > > 02-Oct 03:02 herodotus-sd: Writing spooled data to Volume. Despooling > > 71,612,545,363 bytes ... > > 02-Oct 04:48 herodotus-sd: Writing spooled data to Volume. Despooling > > 71,613,118,263 bytes ... > > 02-Oct 06:15 herodotus-sd: webdev1.2005-10-02_01.05.24 Fatal error: > > Cannot write block. Device at EOM. > > 02-Oct 06:15 herodotus-sd: webdev1.2005-10-02_01.05.24 Fatal error: Fatal > > despooling error.02-Oct 06:15 herodotus-sd: webdev1.2005-10-02_01.05.24 > > Fatal error: Fatal device error: ERR= > > 02-Oct 06:15 webdev1-fd: webdev1.2005-10-02_01.05.24 Fatal error: > > backup.c:472 Network send error 32768 to SD. ERR=Broken pipe > > 02-Oct 06:15 herodotus-dir: webdev1.2005-10-02_01.05.24 Error: Bacula > > 1.36.1 (26Nov04): 02-Oct-2005 06:15:51 > > JobId: 6874 > > Job: webdev1.2005-10-02_01.05.24 > > Backup Level: Full > > Client: webdev1-fd > > FileSet: "webdev1-fileset" 2005-03-11 11:28:01 > > Pool: "Default" > > Storage: "herodotus-sd" > > Start time: 02-Oct-2005 01:05:25 > > End time: 02-Oct-2005 06:15:51 > > FD Files Written: 3,424 > > SD Files Written: 3,424 > > FD Bytes Written: 37,355,359,925 > > SD Bytes Written: 37,355,770,896 > > Rate: 2005.5 KB/s > > Software Compression: None > > Volume name(s): BaculaTape-0009-20041213 > > Volume Session Id: 678 > > Volume Session Time: 1126137632 > > Last Volume Bytes: 12,999,102,895 > > Non-fatal FD errors: 0 > > SD Errors: 0 > > FD termination status: Error > > SD termination status: Error > > Termination: *** Backup Error *** > > > > > > Fatal error, and Last Volume Bytes (LVB) less than before. > > > > Next job: > > > > 02-Oct 01:05 herodotus-dir: Start Backup JobId 6867, > > Job=textbook3.2005-10-02_01.05.17 > > 02-Oct 01:05 herodotus-sd: Spooling data ... > > 02-Oct 03:02 herodotus-sd: Writing spooled data to Volume. Despooling > > 71,612,803,459 bytes ... > > 02-Oct 06:15 herodotus-sd: textbook3.2005-10-02_01.05.17 Fatal error: > > Cannot write block. Device at EOM. > > 02-Oct 06:15 herodotus-sd: textbook3.2005-10-02_01.05.17 Fatal error: > > Fatal despooling error.02-Oct 06:15 herodotus-sd: > > textbook3.2005-10-02_01.05.17 Fatal error: Fatal device error: ERR= > > 02-Oct 06:15 textbook3-fd: textbook3.2005-10-02_01.05.17 Fatal error: .. > > \filed\../../filed/backup.c:472 Network send error 32768 to SD. ERR=An > > existing connection was forcibly closed by the remote host. > > > > And same LVB. > > > > Another job with EOM error, and same LVB. Then a job that succeeds, and > > writes, but same LVB. > > > > Then I get a job that changes tape (0009 to 0010), but then fails with > > the EOM error, and ends up with the same LVB value. > > > > 02-Oct 01:05 herodotus-dir: Start Backup JobId 6868, > > Job=textbook4.2005-10-02_01.05.18 > > 02-Oct 01:05 herodotus-sd: Spooling data ... > > 02-Oct 03:02 herodotus-sd: Writing spooled data to Volume. Despooling > > 71,612,674,411 bytes ... > > 02-Oct 05:13 herodotus-sd: End of Volume "BaculaTape-0009-20041213" at > > 47:819 on device /dev/nst0. Write of 64512 bytes got -1. > > 02-Oct 05:14 herodotus-sd: Re-read of last block succeeded. > > 02-Oct 05:14 herodotus-sd: End of medium on Volume > > "BaculaTape-0009-20041213" Bytes=47,049,606,471 Blocks=729,319 at > > 02-Oct-2005 05:14. > > 02-Oct 05:14 herodotus-dir: Purging oldest volume > > "BaculaTape-0010-20041213" 02-Oct 05:27 herodotus-dir: 215 Files on > > Volume "BaculaTape-0010-20041213" purged from catalog. > > 02-Oct 05:27 herodotus-dir: There are no more Jobs associated with Volume > > "BaculaTape-0010-20041213". Marking it purged. > > 02-Oct 05:27 herodotus-sd: 3301 Issuing autochanger "loaded drive 0" > > command. 02-Oct 05:27 herodotus-sd: 3302 Autochanger "loaded drive 0", > > result is Slot 9. > > 02-Oct 05:27 herodotus-sd: 3303 Issuing autochanger "unload slot 9, drive > > 0" command. > > 02-Oct 05:27 herodotus-sd: 3304 Issuing autochanger "load slot 10, drive > > 0" command. > > 02-Oct 05:30 herodotus-sd: 3305 Autochanger "load slot 10, drive 0", > > status is OK. > > 02-Oct 05:30 herodotus-sd: Recycled volume "BaculaTape-0010-20041213" on > > device "/dev/nst0", all previous data lost. > > 02-Oct 05:30 herodotus-sd: New volume "BaculaTape-0010-20041213" mounted > > on device /dev/nst0 at 02-Oct-2005 05:30. > > 02-Oct 05:36 textbook4-fd: Could not stat c:/Documents and > > Settings/Michael Lovecraft/My Documents/My Music/iTunes/iTunes Music/Ozzy > > Osbourne/Prince Of Darkness [Disc 2]: ERR=The system cannot find the file > > specified. > > > > 02-Oct 05:44 herodotus-sd: Committing spooled data to Volume. Despooling > > 70,261,901,515 bytes ... > > 02-Oct 06:16 herodotus-sd: textbook4.2005-10-02_01.05.18 Fatal error: > > Cannot write block. Device at EOM. > > 02-Oct 06:16 herodotus-sd: Sending spooled attrs to the Director. > > Despooling 1,590,123 bytes ... > > 02-Oct 06:17 textbook4-fd: textbook4.2005-10-02_01.05.18 Fatal error: Bad > > status 102 returned from Storage Daemon. > > 02-Oct 06:17 herodotus-dir: textbook4.2005-10-02_01.05.18 Error: Bacula > > 1.36.1 (26Nov04): 02-Oct-2005 06:17:24 > > JobId: 6868 > > Job: textbook4.2005-10-02_01.05.18 > > Backup Level: Full > > Client: textbook4-fd > > FileSet: "wxpDefault" 2005-01-10 15:27:47 > > Pool: "Default" > > Storage: "herodotus-sd" > > Start time: 02-Oct-2005 01:05:19 > > End time: 02-Oct-2005 06:17:24 > > FD Files Written: 4,023 > > SD Files Written: 4,023 > > FD Bytes Written: 8,942,960,505 > > SD Bytes Written: 8,943,908,548 > > Rate: 477.6 KB/s > > Software Compression: None > > Volume name(s): > > BaculaTape-0010-20041213|BaculaTape-0009-20041213 Volume Session Id: > > 672 > > Volume Session Time: 1126137632 > > Last Volume Bytes: 12,999,102,895 > > Non-fatal FD errors: 1 > > SD Errors: 0 > > FD termination status: Error > > SD termination status: Error > > Termination: *** Backup Error *** > > > > Then I have a few more jobs that succeed, but the LVB value stays the > > same, > > > > Then a few more jobs that fail, interspersed by success (LVB always the > > same, mind you). > > > > Then, today, I ejected 0010, put in 0011, ran a backup. The server got > > 0010 again, ran the backup, and succeeded. And LVB incremented too. > > > > Any ideas, tips, suggestions, etc? > > > > j----- k----- -- Joshua Kugler CDE System Administrator http://distance.uaf.edu/ ------------------------------------------------------- This SF.Net email is sponsored by: Power Architecture Resource Center: Free content, downloads, discussions, and more. http://solutions.newsforge.com/ibmarch.tmpl _______________________________________________ Bacula-users mailing list Bacula-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bacula-users