Hi, On 3/11/2007 6:33 PM, Ferdinando Pasqualetti wrote: > > Hi Arno, > I made some tests and this is what I think. > When there is a tape change after an out of space error susequent block > write continue to get that error even after the tape change by the > robot. This continue. > I made some changes to the block.c routine
You'd better discuss this at bacula-devel, I think, or send Kern a mail explaining the problem and the resolution. > (very simple, because I'm not > a C programmer and also I don't know the logic of sd program). I made > the routine enter the retry loop even for ERNOSPC if file number is 0. > This made bacula-sd work correctly (but it took 20 hours to write file > 0). After writing the EOF mark speed is normal again. > My idea is that changing the tape does not reset the EOD condition on > the tape That sounds like a bug, either in the hardware or the HBA driver. > until a file mark is written. I do not know if this a wrong > device or OS error, but I believe that the FD of tape should be closed > and reopened in a tape change. I know about nothing about these details, so I won't comment on it... > dd and mt tests always gave correct results, but dd always write an EOF > mark at the and of the transfer. > > If you have some idea about that I will be very happy. Thank you very > much in any c asze. Difficult problem, I think. If this is a hardware or driver problem, I don't think modifying the SD code is the right solution. If it works for you - fine, but it might be that you have to manage that patch for your installation yourself. Arno > > -------------------------------------------------------------------------- > Ferdinando Pasqualetti > G.T.Dati srl > Tel. 0557310862 - 3356172731 - Fax 055720143 > > > > > > *Ferdinando Pasqualetti/San Lazzaro/Conserve Italia* > > 27/02/2007 09.47 > > > Per > Arno Lehmann <[EMAIL PROTECTED]> > CC > bacula-users <bacula-users@lists.sourceforge.net> > Oggetto > Rif: Re: [Bacula-users] Change tape problem Link > <Notes:///C12563A900369A93/D46731D63F38165B8025651C003EAC4E/56E29AEC82B8E837C125728E006B77F9> > > > > > > > > Hi Arno, > thank you very much for your answer. I will try asap the tests you are > suggesting. By the way, I purged the volumes involved in the error shown > in the original message (it was the third try), restarted the backup job > and here is the (correct) result. > > 25-feb 19:55 bacula-dir: Start Backup JobId 12927, > Job=webfs3-job.2007-02-25_19.55.40 > 25-feb 19:55 bacula-dir: Recycled volume "web-004" > 25-feb 19:55 webfs3: ClientRunBeforeJob: run command "/root/restartsmb" > 25-feb 19:55 webfs3: ClientRunBeforeJob: Shutting down SMB services: [ > OK ] > 25-feb 19:55 webfs3: ClientRunBeforeJob: smbd: nessun processo terminato > 25-feb 19:55 webfs3: ClientRunBeforeJob: smbd: nessun processo terminato > 25-feb 19:55 webfs3: ClientRunBeforeJob: Starting SMB services: [ OK ] > 25-feb 19:55 webfs3: ClientRunBeforeJob: [ OK ] > 25-feb 19:55 bacula-sd: 3307 Issuing autochanger "unload slot 7, drive > 0" command. > 25-feb 19:57 bacula-sd: 3304 Issuing autochanger "load slot 3, drive 0" > command. > 25-feb 19:57 bacula-sd: 3305 Autochanger "load slot 3, drive 0", status > is OK. > 25-feb 19:57 bacula-sd: 3301 Issuing autochanger "loaded? drive 0" command. > 25-feb 19:57 bacula-sd: 3302 Autochanger "loaded? drive 0", result is > Slot 3. > 25-feb 19:57 bacula-sd: Recycled volume "web-004" on device "LTO1" > (/dev/lto1), all previous data lost. > webfs3: /proc is a different filesystem. Will not descend from / > into /proc > webfs3: /boot is a different filesystem. Will not descend from / > into /boot > webfs3: /dev is a different filesystem. Will not descend from / > into /dev > webfs3: /var/lib/nfs/rpc_pipefs is a different filesystem. Will not > descend from / into /var/lib/nfs/rpc_pipefs > webfs3: /sys is a different filesystem. Will not descend from / > into /sys > webfs3: /uno is a different filesystem. Will not descend from / > into /uno > 26-feb 04:14 bacula-sd: End of Volume "web-004" at 594:6519 on device > "LTO1" (/dev/lto1). Write of 64512 bytes got -1. > 26-feb 04:14 bacula-sd: Re-read of last block succeeded. > 26-feb 04:14 bacula-sd: End of medium on Volume "web-004" > Bytes=594,382,602,240 Blocks=9,213,519 at 26-feb-2007 04:14. > 26-feb 04:14 bacula-dir: Recycled volume "web-005" > 26-feb 04:14 bacula-sd: 3301 Issuing autochanger "loaded? drive 0" command. > 26-feb 04:14 bacula-sd: 3302 Autochanger "loaded? drive 0", result is > Slot 3. > 26-feb 04:14 bacula-sd: 3307 Issuing autochanger "unload slot 3, drive > 0" command. > 26-feb 04:15 bacula-sd: 3304 Issuing autochanger "load slot 4, drive 0" > command. > 26-feb 04:15 bacula-sd: 3305 Autochanger "load slot 4, drive 0", status > is OK. > 26-feb 04:15 bacula-sd: 3301 Issuing autochanger "loaded? drive 0" command. > 26-feb 04:15 bacula-sd: 3302 Autochanger "loaded? drive 0", result is > Slot 4. > 26-feb 04:15 bacula-sd: Recycled volume "web-005" on device "LTO1" > (/dev/lto1), all previous data lost. > 26-feb 04:15 bacula-sd: New volume "web-005" mounted on device "LTO1" > (/dev/lto1) at 26-feb-2007 04:15. > 26-feb 10:21 bacula-sd: End of Volume "web-005" at 528:6656 on device > "LTO1" (/dev/lto1). Write of 64512 bytes got -1. > 26-feb 10:21 bacula-sd: Re-read of last block succeeded. > 26-feb 10:21 bacula-sd: End of medium on Volume "web-005" > Bytes=528,395,664,384 Blocks=8,190,656 at 26-feb-2007 10:21. > 26-feb 10:21 bacula-dir: Recycled volume "web-006" > 26-feb 10:21 bacula-sd: 3301 Issuing autochanger "loaded? drive 0" command. > 26-feb 10:21 bacula-sd: 3302 Autochanger "loaded? drive 0", result is > Slot 4. > 26-feb 10:21 bacula-sd: 3307 Issuing autochanger "unload slot 4, drive > 0" command. > 26-feb 10:22 bacula-sd: 3304 Issuing autochanger "load slot 5, drive 0" > command. > 26-feb 10:22 bacula-sd: 3305 Autochanger "load slot 5, drive 0", status > is OK. > 26-feb 10:22 bacula-sd: 3301 Issuing autochanger "loaded? drive 0" command. > 26-feb 10:22 bacula-sd: 3302 Autochanger "loaded? drive 0", result is > Slot 5. > 26-feb 10:23 bacula-sd: Recycled volume "web-006" on device "LTO1" > (/dev/lto1), all previous data lost. > 26-feb 10:23 bacula-sd: New volume "web-006" mounted on device "LTO1" > (/dev/lto1) at 26-feb-2007 10:23. > 26-feb 13:49 bacula-sd: Job write elapsed time = 17:48:45, Transfer rate > = 21.65 M bytes/second > 26-feb 13:49 bacula-sd: Alert: SCSI 2 tape drive: > 26-feb 13:49 bacula-sd: Alert: File number=267, block number=0, partition=0. > 26-feb 13:49 bacula-sd: Alert: Tape block size 0 bytes. Density code > 0x44 (no translation). > 26-feb 13:49 bacula-sd: Alert: Soft error count since last status=0 > 26-feb 13:49 bacula-sd: Alert: General status bits on (81010000): > 26-feb 13:49 bacula-sd: Alert: EOF ONLINE IM_REP_EN > 26-feb 13:49 bacula-dir: Bacula 2.0.2 (28Jan07): 26-feb-2007 13:49:03 > JobId: 12927 > Job: webfs3-job.2007-02-25_19.55.40 > Backup Level: Full > Client: "webfs3" 2.0.2 (28Jan07) > i686-redhat-linux-gnu,redhat,Enterprise release > FileSet: "webfs3-fileset" 2005-04-30 07:13:53 > Pool: "webfs" (From Job resource) > Storage: "LTO-1" (From user selection) > Scheduled time: 25-feb-2007 19:55:17 > Start time: 25-feb-2007 19:55:46 > End time: 26-feb-2007 13:49:03 > Elapsed time: 17 hours 53 mins 17 secs > Priority: 10 > FD Files Written: 4,046,880 > SD Files Written: 4,046,880 > FD Bytes Written: 1,387,910,783,372 (1.387 TB) > SD Bytes Written: 1,388,589,182,436 (1.388 TB) > Rate: 21552.4 KB/s > Software Compression: None > VSS: no > Encryption: no > Volume name(s): web-004|web-005|web-006 > Volume Session Id: 1 > Volume Session Time: 1172427565 > Last Volume Bytes: 266,951,559,168 (266.9 GB) > Non-fatal FD errors: 0 > SD Errors: 0 > FD termination status: OK > SD termination status: OK > Termination: Backup OK > > > The thing that is not in favour of an hardware or OS problem is that > with the same hardware and OS bacula 1.36.3 had not this problem, it > arised with 1.38.11. > The device setup is quite simple: > > > Device { > Name = LTO1 > Media Type = LTO-3 > Archive Device = /dev/lto1 > AutomaticMount = yes; # when device opened, read it > AlwaysOpen = no; > Autoselect = no > RemovableMedia = yes; > RandomAccess = no; > Changer Command = "/etc/bacula/mtx-changer %c %o %S %a %d" > Changer Device = /dev/chg4 > Drive Index = 0 > AutoChanger = yes > Alert Command = "sh -c 'mt -f %a status'" > Maximum Network Buffer Size = 65536 > } > > Devices /dev/lto1 and /dev/chg4 are symlinks to real devices in order to > manage hardware configuration changes. > > Thanks again > > -------------------------------------------------------------------------- > Ferdinando Pasqualetti > G.T.Dati srl > Tel. 0557310862 - 3356172731 - Fax 055720143 > > > > > > *Arno Lehmann <[EMAIL PROTECTED]>* > Inviato da: [EMAIL PROTECTED] > > 26/02/2007 20.33 > > > Per > bacula-users <bacula-users@lists.sourceforge.net> > CC > > Oggetto > Re: [Bacula-users] Change tape problem > > > > > > > > > Hello, > > On 2/26/2007 10:54 AM, Ferdinando Pasqualetti wrote: > > > > Hi Bacula users, > > sorry if you get this message two times, I sent it with a wrong sender > > (not in the list), so I am sending it again. > > I am facing a problem that came out with rev. 1.38.11 (I never saw it > > with 1.36.3). The problem did not happen all times, but very often. Now > > I switched to 2.0.2 and this problem is much more frequent. > > The problem is that when a tape was exhausted bacula changes correctly > > the tape in the autochanger drive but just after get this error: > > > > 25-feb 02:47 bacula-sd: End of Volume "web-004" at 594:3362 on device > > "LTO1" (/dev/lto1). Write of 64512 bytes got -1. > > 25-feb 02:47 bacula-sd: Re-read of last block succeeded. > > 25-feb 02:47 bacula-sd: End of medium on Volume "web-004" > > Bytes=594,178,937,856 Blocks=9,210,362 at 25-feb-2007 02:47. > > 25-feb 02:47 bacula-sd: 3301 Issuing autochanger "loaded? drive 0" > command. > > 25-feb 02:47 bacula-sd: 3302 Autochanger "loaded? drive 0", result is > > Slot 3. > > 25-feb 02:47 bacula-sd: 3307 Issuing autochanger "unload slot 3, drive > > 0" command. > > 25-feb 02:48 bacula-sd: 3304 Issuing autochanger "load slot 4, drive 0" > > command. > > 25-feb 02:48 bacula-sd: 3305 Autochanger "load slot 4, drive 0", status > > is OK. > > 25-feb 02:48 bacula-sd: 3301 Issuing autochanger "loaded? drive 0" > command. > > 25-feb 02:48 bacula-sd: 3302 Autochanger "loaded? drive 0", result is > > Slot 4. > > 25-feb 02:49 bacula-sd: Wrote label to prelabeled Volume "web-005" on > > device "LTO1" (/dev/lto1) > > 25-feb 02:49 bacula-sd: New volume "web-005" mounted on device "LTO1" > > (/dev/lto1) at 25-feb-2007 02:49. > > 25-feb 02:49 bacula-sd: End of Volume "web-005" at 0:1 on device "LTO1" > > (/dev/lto1). Write of 64512 bytes got -1. > > 25-feb 02:49 bacula-sd: webfs3-job.2007-02-24_20.03.22 Error: Re-read of > > last block OK, but block numbers differ. Last block=0 Current > block=9210362. > > 25-feb 02:49 bacula-sd: Job write elapsed time = 06:43:26, Transfer rate > > = 24.52 M bytes/second > > 25-feb 02:49 webfs3: webfs3-job.2007-02-24_20.03.22 Fatal error: > > backup.c:860 Network send error to SD. ERR=Pipe rotta > > 25-feb 02:49 bacula-dir: webfs3-job.2007-02-24_20.03.22 Error: Bacula > > 2.0.2 (28Jan07): 25-feb-2007 02:49:25 > > > > It seems there are two problems, the first one (and the most important > > one) is that bacula get an end of volume on the new tape, > > What Bacula reports as an EOT can be caused by a drive error, too, so > for the time being I assume that the second error is tightly related to > this one. > > > and the second > > one is the difference in the last block (it appears to be the last block > > of the previous tape). > > If that's the case, and your description seems quite clear, you might > have found an OS or hardware bug, too. > > This is only guesswork, but it could be possible that, after a tape > change, the hardware or the tape driver don't update their state > information. > > If that's the case, you could try the following: > - first, have a look at your system log and dmesg output. There might be > errors reported there. > - second, try to reproduce the problem without using Bacula. Unmount the > tape drive from bconsole. Load a tape (an unused one, or one with write > protection). If you use an empty tape, write some data and some file > marks to it, ending with an EOT mark. dd and mt are tools for that purpose. > Then, use tapeinfo or st to observe the tape status, especialy the block > position reported, when doing some rewinds, fast forwards, offline, and > see what happens after you used mtx to unload and reload that tape. > > If there really is a problem with the hardware or the OD driver, you > should be able to reproduce it then. Updating the drive firmware and the > OS (or, if that's up to date, filing a bug report) would be two options > then. > > Otherwise, you should run btape again, because there are some things in > the report I don't like - errors writing the last block to tape should > not happen with current hardware, for example. You might try to tune > your device configuration, and perhaps you'll have to set the tape > driver to a different write mode. Suggesting something is difficult > without seeing how it's setup now :-) > > > Bacula is a MySQL version on a RedHat AS 4.04, rpmbuilt on that system, > > an HP proliant G3 3.2 Ghz, 2Gb RAM. > > The tape is an MSL6000 with two LTO-3 drives, drived by bacula directly > > (not using the autochanger as device - 1.36.3 setup). > > Btape tests run correctly, including the "fill and change tape" (I am > > attaching the test result, if someone is interested). > > Did anyone get a similar problem? > > That basic setup should run ok I think... nothing unusual there. > > Arno > > > > > > -------------------------------------------------------------------------- > > Ferdinando Pasqualetti > > G.T.Dati srl > > Tel. 0557310862 - 3356172731 - Fax 055720143 > > > > > > > > ------------------------------------------------------------------------ > > > > ------------------------------------------------------------------------- > > Take Surveys. Earn Cash. Influence the Future of IT > > Join SourceForge.net's Techsay panel and you'll get the chance to > share your > > opinions on IT & business topics through brief surveys-and earn cash > > http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV > > > > > > ------------------------------------------------------------------------ > > > > _______________________________________________ > > Bacula-users mailing list > > Bacula-users@lists.sourceforge.net > > https://lists.sourceforge.net/lists/listinfo/bacula-users > > -- > IT-Service Lehmann [EMAIL PROTECTED] > Arno Lehmann http://www.its-lehmann.de > > ------------------------------------------------------------------------- > Take Surveys. Earn Cash. Influence the Future of IT > Join SourceForge.net's Techsay panel and you'll get the chance to share your > opinions on IT & business topics through brief surveys-and earn cash > http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV > _______________________________________________ > Bacula-users mailing list > Bacula-users@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/bacula-users > > > > ------------------------------------------------------------------------ > > ------------------------------------------------------------------------- > Take Surveys. Earn Cash. Influence the Future of IT > Join SourceForge.net's Techsay panel and you'll get the chance to share your > opinions on IT & business topics through brief surveys-and earn cash > http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV > > > ------------------------------------------------------------------------ > > _______________________________________________ > Bacula-users mailing list > Bacula-users@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/bacula-users -- IT-Service Lehmann [EMAIL PROTECTED] Arno Lehmann http://www.its-lehmann.de ------------------------------------------------------------------------- Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT & business topics through brief surveys-and earn cash http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV _______________________________________________ Bacula-users mailing list Bacula-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bacula-users