Hi,

On 3/11/2007 6:33 PM, Ferdinando Pasqualetti wrote:
> 
> Hi Arno,
> I made some tests and this is what I think.
> When there is a tape change after an out-of-space error, subsequent block 
> writes continue to get that error even after the robot has changed the 
> tape, and the error persists.
> I made some changes to the block.c routine

You'd better discuss this at bacula-devel, I think, or send Kern a mail 
explaining the problem and the resolution.

> (very simple, because I'm not 
> a C programmer and I don't know the logic of the SD program). I made 
> the routine enter the retry loop even for ENOSPC if the file number is 0. 
> This made bacula-sd work correctly (but it took 20 hours to write file 
> 0). After the EOF mark is written, the speed is normal again.
> My idea is that changing the tape does not reset the EOD condition on 
> the tape

That sounds like a bug, either in the hardware or the HBA driver.

> until a file mark is written. I do not know if this is a device 
> or OS error, but I believe that the tape's file descriptor should be 
> closed and reopened on a tape change.

I know next to nothing about these details, so I won't comment on it...

> dd and mt tests always gave correct results, but dd always writes an EOF 
> mark at the end of the transfer.
> 
> If you have any ideas about that, I would be very happy. Thank you very 
> much in any case.

Difficult problem, I think.

If this is a hardware or driver problem, I don't think modifying the SD 
code is the right solution.

If it works for you - fine, but it might be that you have to manage that 
patch for your installation yourself.
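
To make the workaround discussed above concrete, here is a minimal sketch of
the idea as described in the quoted text: retry a failed block write when the
error is ENOSPC and the file number is still 0. It is written in Python purely
for illustration and is not the code in Bacula's block.c. The function name,
write_fn, and max_retries are hypothetical, and the bounded retry count is an
assumption of this sketch.

```python
import errno


def write_block_with_retry(write_fn, block, file_number, max_retries=3):
    """Write one block, retrying ENOSPC only while still in file 0.

    write_fn is a hypothetical callable that behaves like os.write on an
    already opened tape device: it returns the number of bytes written or
    raises OSError.
    """
    attempts = 0
    while True:
        try:
            return write_fn(block)
        except OSError as exc:
            # ENOSPC at file 0 on a freshly loaded tape is treated as a
            # stale end-of-data condition, not a real end of medium.
            stale_eod = exc.errno == errno.ENOSPC and file_number == 0
            if not stale_eod or attempts >= max_retries:
                raise
            attempts += 1
```

Any other error, or ENOSPC at a file number above 0, is passed straight to the
caller, so a genuine end of medium would still be handled as before.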

Arno

>  
> --------------------------------------------------------------------------
> Ferdinando Pasqualetti
> G.T.Dati srl
> Tel. 0557310862 - 3356172731 - Fax 055720143
> 
> 
> 
> 
> 
> *Ferdinando Pasqualetti/San Lazzaro/Conserve Italia*
> 
> 27/02/2007 09.47
> 
>       
> To
>       Arno Lehmann <[EMAIL PROTECTED]>
> CC
>       bacula-users <bacula-users@lists.sourceforge.net>
> Subject
>       Ref: Re: [Bacula-users] Change tape problem
> 
> 
>       
> 
> 
> 
> 
> Hi Arno,
> thank you very much for your answer. I will try the tests you suggest 
> as soon as possible. By the way, I purged the volumes involved in the 
> error shown in the original message (it was the third try), restarted 
> the backup job, and here is the (correct) result.
> 
> 25-feb 19:55 bacula-dir: Start Backup JobId 12927, 
> Job=webfs3-job.2007-02-25_19.55.40
> 25-feb 19:55 bacula-dir: Recycled volume "web-004"
> 25-feb 19:55 webfs3: ClientRunBeforeJob: run command "/root/restartsmb"
> 25-feb 19:55 webfs3: ClientRunBeforeJob: Shutting down SMB services: [ 
>  OK  ]
> 25-feb 19:55 webfs3: ClientRunBeforeJob: smbd: nessun processo terminato
> 25-feb 19:55 webfs3: ClientRunBeforeJob: smbd: nessun processo terminato
> 25-feb 19:55 webfs3: ClientRunBeforeJob: Starting SMB services: [  OK  ]
> 25-feb 19:55 webfs3: ClientRunBeforeJob: [  OK  ]
> 25-feb 19:55 bacula-sd: 3307 Issuing autochanger "unload slot 7, drive 
> 0" command.
> 25-feb 19:57 bacula-sd: 3304 Issuing autochanger "load slot 3, drive 0" 
> command.
> 25-feb 19:57 bacula-sd: 3305 Autochanger "load slot 3, drive 0", status 
> is OK.
> 25-feb 19:57 bacula-sd: 3301 Issuing autochanger "loaded? drive 0" command.
> 25-feb 19:57 bacula-sd: 3302 Autochanger "loaded? drive 0", result is 
> Slot 3.
> 25-feb 19:57 bacula-sd: Recycled volume "web-004" on device "LTO1" 
> (/dev/lto1), all previous data lost.
> webfs3:      /proc is a different filesystem. Will not descend from / 
> into /proc
> webfs3:      /boot is a different filesystem. Will not descend from / 
> into /boot
> webfs3:      /dev is a different filesystem. Will not descend from / 
> into /dev
> webfs3:      /var/lib/nfs/rpc_pipefs is a different filesystem. Will not 
> descend from / into /var/lib/nfs/rpc_pipefs
> webfs3:      /sys is a different filesystem. Will not descend from / 
> into /sys
> webfs3:      /uno is a different filesystem. Will not descend from / 
> into /uno
> 26-feb 04:14 bacula-sd: End of Volume "web-004" at 594:6519 on device 
> "LTO1" (/dev/lto1). Write of 64512 bytes got -1.
> 26-feb 04:14 bacula-sd: Re-read of last block succeeded.
> 26-feb 04:14 bacula-sd: End of medium on Volume "web-004" 
> Bytes=594,382,602,240 Blocks=9,213,519 at 26-feb-2007 04:14.
> 26-feb 04:14 bacula-dir: Recycled volume "web-005"
> 26-feb 04:14 bacula-sd: 3301 Issuing autochanger "loaded? drive 0" command.
> 26-feb 04:14 bacula-sd: 3302 Autochanger "loaded? drive 0", result is 
> Slot 3.
> 26-feb 04:14 bacula-sd: 3307 Issuing autochanger "unload slot 3, drive 
> 0" command.
> 26-feb 04:15 bacula-sd: 3304 Issuing autochanger "load slot 4, drive 0" 
> command.
> 26-feb 04:15 bacula-sd: 3305 Autochanger "load slot 4, drive 0", status 
> is OK.
> 26-feb 04:15 bacula-sd: 3301 Issuing autochanger "loaded? drive 0" command.
> 26-feb 04:15 bacula-sd: 3302 Autochanger "loaded? drive 0", result is 
> Slot 4.
> 26-feb 04:15 bacula-sd: Recycled volume "web-005" on device "LTO1" 
> (/dev/lto1), all previous data lost.
> 26-feb 04:15 bacula-sd: New volume "web-005" mounted on device "LTO1" 
> (/dev/lto1) at 26-feb-2007 04:15.
> 26-feb 10:21 bacula-sd: End of Volume "web-005" at 528:6656 on device 
> "LTO1" (/dev/lto1). Write of 64512 bytes got -1.
> 26-feb 10:21 bacula-sd: Re-read of last block succeeded.
> 26-feb 10:21 bacula-sd: End of medium on Volume "web-005" 
> Bytes=528,395,664,384 Blocks=8,190,656 at 26-feb-2007 10:21.
> 26-feb 10:21 bacula-dir: Recycled volume "web-006"
> 26-feb 10:21 bacula-sd: 3301 Issuing autochanger "loaded? drive 0" command.
> 26-feb 10:21 bacula-sd: 3302 Autochanger "loaded? drive 0", result is 
> Slot 4.
> 26-feb 10:21 bacula-sd: 3307 Issuing autochanger "unload slot 4, drive 
> 0" command.
> 26-feb 10:22 bacula-sd: 3304 Issuing autochanger "load slot 5, drive 0" 
> command.
> 26-feb 10:22 bacula-sd: 3305 Autochanger "load slot 5, drive 0", status 
> is OK.
> 26-feb 10:22 bacula-sd: 3301 Issuing autochanger "loaded? drive 0" command.
> 26-feb 10:22 bacula-sd: 3302 Autochanger "loaded? drive 0", result is 
> Slot 5.
> 26-feb 10:23 bacula-sd: Recycled volume "web-006" on device "LTO1" 
> (/dev/lto1), all previous data lost.
> 26-feb 10:23 bacula-sd: New volume "web-006" mounted on device "LTO1" 
> (/dev/lto1) at 26-feb-2007 10:23.
> 26-feb 13:49 bacula-sd: Job write elapsed time = 17:48:45, Transfer rate 
> = 21.65 M bytes/second
> 26-feb 13:49 bacula-sd: Alert: SCSI 2 tape drive:
> 26-feb 13:49 bacula-sd: Alert: File number=267, block number=0, partition=0.
> 26-feb 13:49 bacula-sd: Alert: Tape block size 0 bytes. Density code 
> 0x44 (no translation).
> 26-feb 13:49 bacula-sd: Alert: Soft error count since last status=0
> 26-feb 13:49 bacula-sd: Alert: General status bits on (81010000):
> 26-feb 13:49 bacula-sd: Alert:  EOF ONLINE IM_REP_EN
> 26-feb 13:49 bacula-dir: Bacula 2.0.2 (28Jan07): 26-feb-2007 13:49:03
>  JobId:                  12927
>  Job:                    webfs3-job.2007-02-25_19.55.40
>  Backup Level:           Full
>  Client:                 "webfs3" 2.0.2 (28Jan07) 
> i686-redhat-linux-gnu,redhat,Enterprise release
>  FileSet:                "webfs3-fileset" 2005-04-30 07:13:53
>  Pool:                   "webfs" (From Job resource)
>  Storage:                "LTO-1" (From user selection)
>  Scheduled time:         25-feb-2007 19:55:17
>  Start time:             25-feb-2007 19:55:46
>  End time:               26-feb-2007 13:49:03
>  Elapsed time:           17 hours 53 mins 17 secs
>  Priority:               10
>  FD Files Written:       4,046,880
>  SD Files Written:       4,046,880
>  FD Bytes Written:       1,387,910,783,372 (1.387 TB)
>  SD Bytes Written:       1,388,589,182,436 (1.388 TB)
>  Rate:                   21552.4 KB/s
>  Software Compression:   None
>  VSS:                    no
>  Encryption:             no
>  Volume name(s):         web-004|web-005|web-006
>  Volume Session Id:      1
>  Volume Session Time:    1172427565
>  Last Volume Bytes:      266,951,559,168 (266.9 GB)
>  Non-fatal FD errors:    0
>  SD Errors:              0
>  FD termination status:  OK
>  SD termination status:  OK
>  Termination:            Backup OK
> 
> 
> What argues against a hardware or OS problem is that, with the same 
> hardware and OS, Bacula 1.36.3 did not have this problem; it first 
> appeared with 1.38.11.
> The device setup is quite simple:
> 
> 
> Device {
>   Name = LTO1
>   Media Type = LTO-3
>   Archive Device = /dev/lto1
>   AutomaticMount = yes;               # when device opened, read it
>   AlwaysOpen = no;
>   Autoselect = no
>   RemovableMedia = yes;
>   RandomAccess = no;
>   Changer Command = "/etc/bacula/mtx-changer %c %o %S %a %d"
>   Changer Device = /dev/chg4
>   Drive Index = 0
>   AutoChanger = yes
>   Alert Command = "sh -c 'mt -f %a status'"
>   Maximum Network Buffer Size = 65536
> }
> 
> Devices /dev/lto1 and /dev/chg4 are symlinks to real devices in order to 
> manage hardware configuration changes.
> 
> Thanks again
> 
> --------------------------------------------------------------------------
> Ferdinando Pasqualetti
> G.T.Dati srl
> Tel. 0557310862 - 3356172731 - Fax 055720143
> 
> 
> 
> 
> 
> *Arno Lehmann <[EMAIL PROTECTED]>*
> Sent by: [EMAIL PROTECTED]
> 
> 26/02/2007 20.33
> 
>       
> To
>       bacula-users <bacula-users@lists.sourceforge.net>
> CC
>       
> Subject
>       Re: [Bacula-users] Change tape problem
> 
> 
>       
> 
> 
> 
> 
> 
> Hello,
> 
> On 2/26/2007 10:54 AM, Ferdinando Pasqualetti wrote:
>  >
>  > Hi Bacula users,
>  > sorry if you get this message twice; I sent it from the wrong sender
>  > address (not on the list), so I am sending it again.
>  > I am facing a problem that first appeared with rev. 1.38.11 (I never saw
>  > it with 1.36.3). The problem did not happen every time, but very often.
>  > Now I have switched to 2.0.2 and the problem is much more frequent.
>  > When a tape is exhausted, Bacula correctly changes the tape in the
>  > autochanger drive, but immediately afterwards gets this error:
>  >
>  > 25-feb 02:47 bacula-sd: End of Volume "web-004" at 594:3362 on device
>  > "LTO1" (/dev/lto1). Write of 64512 bytes got -1.
>  > 25-feb 02:47 bacula-sd: Re-read of last block succeeded.
>  > 25-feb 02:47 bacula-sd: End of medium on Volume "web-004"
>  > Bytes=594,178,937,856 Blocks=9,210,362 at 25-feb-2007 02:47.
>  > 25-feb 02:47 bacula-sd: 3301 Issuing autochanger "loaded? drive 0" 
> command.
>  > 25-feb 02:47 bacula-sd: 3302 Autochanger "loaded? drive 0", result is
>  > Slot 3.
>  > 25-feb 02:47 bacula-sd: 3307 Issuing autochanger "unload slot 3, drive
>  > 0" command.
>  > 25-feb 02:48 bacula-sd: 3304 Issuing autochanger "load slot 4, drive 0"
>  > command.
>  > 25-feb 02:48 bacula-sd: 3305 Autochanger "load slot 4, drive 0", status
>  > is OK.
>  > 25-feb 02:48 bacula-sd: 3301 Issuing autochanger "loaded? drive 0" 
> command.
>  > 25-feb 02:48 bacula-sd: 3302 Autochanger "loaded? drive 0", result is
>  > Slot 4.
>  > 25-feb 02:49 bacula-sd: Wrote label to prelabeled Volume "web-005" on
>  > device "LTO1" (/dev/lto1)
>  > 25-feb 02:49 bacula-sd: New volume "web-005" mounted on device "LTO1"
>  > (/dev/lto1) at 25-feb-2007 02:49.
>  > 25-feb 02:49 bacula-sd: End of Volume "web-005" at 0:1 on device "LTO1"
>  > (/dev/lto1). Write of 64512 bytes got -1.
>  > 25-feb 02:49 bacula-sd: webfs3-job.2007-02-24_20.03.22 Error: Re-read of
>  > last block OK, but block numbers differ. Last block=0 Current 
> block=9210362.
>  > 25-feb 02:49 bacula-sd: Job write elapsed time = 06:43:26, Transfer rate
>  > = 24.52 M bytes/second
>  > 25-feb 02:49 webfs3: webfs3-job.2007-02-24_20.03.22 Fatal error:
>  > backup.c:860 Network send error to SD. ERR=Pipe rotta
>  > 25-feb 02:49 bacula-dir: webfs3-job.2007-02-24_20.03.22 Error: Bacula
>  > 2.0.2 (28Jan07): 25-feb-2007 02:49:25
>  >
>  > It seems there are two problems. The first one (and the most important
>  > one) is that Bacula gets an end of volume on the new tape,
> 
> What Bacula reports as an EOT can be caused by a drive error, too, so
> for the time being I assume that the second error is tightly related to
> this one.
> 
>  > and the second
>  > one is the difference in the last block (it appears to be the last block
>  > of the previous tape).
> 
> If that's the case, and your description seems quite clear, you might
> have found an OS or hardware bug, too.
> 
> This is only guesswork, but it could be possible that, after a tape
> change, the hardware or the tape driver don't update their state
> information.
> 
> If that's the case, you could try the following:
> - first, have a look at your system log and dmesg output. There might be
>  errors reported there.
> - second, try to reproduce the problem without using Bacula. Unmount the
> tape drive from bconsole. Load a tape (an unused one, or one with write
> protection). If you use an empty tape, write some data and some file
> marks to it, ending with an EOT mark. dd and mt are tools for that purpose.
> Then, use tapeinfo or st to observe the tape status, especially the block
> position reported, when doing some rewinds, fast forwards, offline, and
> see what happens after you used mtx to unload and reload that tape.
> 
> If there really is a problem with the hardware or the OS driver, you
> should be able to reproduce it then. Updating the drive firmware and the
> OS (or, if that's up to date, filing a bug report) would be two options
> then.
> 
> Otherwise, you should run btape again, because there are some things in
> the report I don't like - errors writing the last block to tape should
> not happen with current hardware, for example. You might try to tune
> your device configuration, and perhaps you'll have to set the tape
> driver to a different write mode. Suggesting something is difficult
> without seeing how it's set up now :-)
> 
>  > Bacula is a MySQL version on RedHat AS 4.04, rpmbuilt on that system,
>  > an HP ProLiant G3, 3.2 GHz, 2 GB RAM.
>  > The tape library is an MSL6000 with two LTO-3 drives, driven by Bacula
>  > directly (not using the autochanger as device - 1.36.3 setup).
>  > Btape tests run correctly, including the "fill and change tape" (I am
>  > attaching the test result, if someone is interested).
>  > Did anyone get a similar problem?
> 
> That basic setup should run ok I think... nothing unusual there.
> 
> Arno
> 
>  >
>  > 
> --------------------------------------------------------------------------
>  > Ferdinando Pasqualetti
>  > G.T.Dati srl
>  > Tel. 0557310862 - 3356172731 - Fax 055720143
>  >
>  >
>  >
>  > ------------------------------------------------------------------------
>  >
>  > -------------------------------------------------------------------------
>  > Take Surveys. Earn Cash. Influence the Future of IT
>  > Join SourceForge.net's Techsay panel and you'll get the chance to 
> share your
>  > opinions on IT & business topics through brief surveys-and earn cash
>  > http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
>  >
>  >
>  > ------------------------------------------------------------------------
>  >
>  > _______________________________________________
>  > Bacula-users mailing list
>  > Bacula-users@lists.sourceforge.net
>  > https://lists.sourceforge.net/lists/listinfo/bacula-users
> 
> -- 
> IT-Service Lehmann                    [EMAIL PROTECTED]
> Arno Lehmann                  http://www.its-lehmann.de
> 

-- 
IT-Service Lehmann                    [EMAIL PROTECTED]
Arno Lehmann                  http://www.its-lehmann.de
