On Tuesday 20 January 2009 20:25:44 Pasi Kärkkäinen wrote:
> On Tue, Jan 20, 2009 at 07:12:56PM +0100, Kern Sibbald wrote:
> > On Tuesday 20 January 2009 14:06:01 Pasi Kärkkäinen wrote:
> > > On Mon, Jan 19, 2009 at 05:10:53PM +0100, Kern Sibbald wrote:
> > > > On Monday 19 January 2009 16:38:53 Pasi Kärkkäinen wrote:
> > > > > On Mon, Jan 19, 2009 at 04:11:18PM +0100, Kern Sibbald wrote:
> > > > > > Hello again,
> > > > > >
> > > > > > Just to clarify, so that no one panics. The bug I mentioned
> > > > > > occurred only in the 2.5.x code for a relatively short time.
> > > > > > Whether or not it actually hit you I cannot say, but in any case,
> > > > > > the current versions are fixed, and the new code includes
> > > > > > significant optimization compared to 2.4.x and older 2.5.x for
> > > > > > Migration and Copy jobs.
> > > > >
> > > > > Yep. I'm running 2.5 development version, and I'm aware there might
> > > > > be issues. Someone needs to test this stuff! :)
> > > > >
> > > > > Original backup job report:
> > > > >
> > > > > FD Files Written: 232,812
> > > > > SD Files Written: 232,812
> > > > > FD Bytes Written: 126,585,056,881 (126.5 GB)
> > > > > SD Bytes Written: 126,632,767,722 (126.6 GB)
> > > > >
> > > > > Copy job report:
> > > > >
> > > > > SD Files Written: 232,812
> > > > > SD Bytes Written: 126,632,767,722 (126.6 GB)
> > > > >
> > > > >
> > > > > Another backup job report:
> > > > >
> > > > > FD Files Written: 209,483
> > > > > SD Files Written: 209,483
> > > > > FD Bytes Written: 33,861,760,972 (33.86 GB)
> > > > > SD Bytes Written: 33,899,761,248 (33.89 GB)
> > > > >
> > > > > Copy job report:
> > > > >
> > > > > SD Files Written: 209,483
> > > > > SD Bytes Written: 33,899,761,248 (33.89 GB)
> > > > >
> > > > >
> > > > > So at least the file counts seems to match.. and also the SD bytes
> > > > > written..
> > > >
> > > > Well, it looks like the copies worked, but the errors you saw should
> > > > not be there ...
> > >
> > > Yeah.. wondering why the job status/termination is "OK" if with these
> > > errors in the logs?
> > >
> > > > > Anyway, I'll upgrade Bacula now and we'll see if these Volume data
> > > > > errors disappear for copy jobs.
> > > >
> > > > OK
> > >
> > > Now running SVN revision 8381 (2.5.29).
> > >
> > > Unfortunately I still see these errors..
> > >
> > > First it was all OK without errors for some hours, but then:
> > >
> > > Ready to read from volume "Pool2-Vol-0104" on device "FSDevice2"
> > > (/mnt/backup1/pool02). Forward spacing Volume "Pool2-Vol-0104" to
> > > file:block 0:218.
> > > Error: block.c:1098 Volume data error at 0:3599769803! Short block of
> > > 7988 bytes on device "FSDevice2" (/mnt/backup1/pool02) discarded.
> > > Error: read_record.c:148 block.c:1098 Volume data error at
> > > 0:3599769803! Short block of 7988 bytes on device "FSDevice2"
> > > (/mnt/backup1/pool02) discarded. End of file 0 on device "FSDevice2"
> > > (/mnt/backup1/pool02), Volume "Pool2-Vol-0104"
> > >
> > > .. And then it continues OK with the next file volume, and then again
> > > similar errors for the next file volume:
> > >
> > > Ready to read from volume "Pool2-Vol-0117" on device "FSDevice2"
> > > (/mnt/backup1/pool02). Forward spacing Volume "Pool2-Vol-0117" to
> > > file:block 0:218.
> > > Error: block.c:1098 Volume data error at 1:2863735978! Short block of
> > > 27477 bytes on device "FSDevice2" (/mnt/backup1/pool02) discarded.
> > > Error: read_record.c:148 block.c:1098 Volume data error at
> > > 1:2863735978! Short block of 27477 bytes on device "FSDevice2"
> > > (/mnt/backup1/pool02) discarded. End of file 1 on device "FSDevice2"
> > > (/mnt/backup1/pool02), Volume "Pool2-Vol-0117"
> > >
> > > I don't see any errors in kernel dmesg and/or syslog.
> > >
> > > Any suggestions?
> >
> > Were the files that are being read written with Bacula version 2.5.29?
>
> There's a big chance those files were created using the older Bacula 2.5
> version. I'll have to check that..

Yes, that would be the first thing to check. I don't expect any problems in
the database or with older backups, but it is worth confirming. The problem
we had was only in the copying process (actually in the seeking).

>
> > What kind of device is /mnt/backup1/poolnn?
> >
> > If it is some sort of network mount, then you probably have a bad driver,
> > bad network, or something wrong on the other end, and you should try
> > running using local disk.
>
> Hmm.. it's iSCSI volume. I assume I'd had SCSI errors in the logs, or
> filesystem errors if that was the reason.. everything is possible, of
> course..

If it is a driver bug on either side, it might not necessarily show up in
the logs. With iSCSI, there is a lot of software involved, and thus the
possibility for lots of problems, especially since it is not 20-year-old
technology.

Depending on what OS you are using, there may be some problem with the way
Bacula does seeking on hard disk.

I would still recommend trying to write to a locally mounted disk. If the
problem still occurs with a locally mounted disk, then it will point
strongly toward Bacula.

_______________________________________________
Bacula-devel mailing list
https://lists.sourceforge.net/lists/listinfo/bacula-devel
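
To see which volumes are affected (and so which ones to check for having been written by the older 2.5 code), the short-block errors can be pulled out of a job log and grouped by volume. What follows is only a minimal Python sketch, assuming the log wording matches the excerpts quoted in this thread. The function name summarize_short_blocks and both regular expressions are illustrative and are not part of Bacula.

```python
import re
from collections import defaultdict

# Patterns are modeled only on the log excerpts quoted in this thread;
# other Bacula versions may word these messages differently.
READY_RE = re.compile(r'Ready to read from volume "([^"]+)"')
ERROR_RE = re.compile(
    r'Error: block\.c:\d+ Volume data error at (\d+):(\d+)! '
    r'Short block of (\d+) bytes'
)


def summarize_short_blocks(log_text):
    """Return {volume: [(file, block, nbytes), ...]} for short-block errors."""
    # Job logs wrap long messages across lines, so collapse whitespace first.
    text = " ".join(log_text.split())

    # Collect both kinds of message and walk them in log order.
    events = [(m.start(), "ready", m) for m in READY_RE.finditer(text)]
    events += [(m.start(), "error", m) for m in ERROR_RE.finditer(text)]
    events.sort(key=lambda e: e[0])

    current = None  # volume named by the most recent "Ready to read" line
    result = defaultdict(list)
    for _, kind, m in events:
        if kind == "ready":
            current = m.group(1)
        else:
            entry = (int(m.group(1)), int(m.group(2)), int(m.group(3)))
            # The same error may be reported more than once; keep one copy.
            if entry not in result[current]:
                result[current].append(entry)
    return dict(result)
```

Calling summarize_short_blocks(open(path).read()) on a saved job log (path being wherever the log was saved) gives one entry per volume, which can then be compared against the dates on which each volume was written.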
