On Thursday 22 March 2007 16:31, Ekkehard Burkon wrote:
> Hi,
> 
> Kern Sibbald wrote:
> 
> > 
> > Two things:
> > 
> > 1. Something seems wrong (possibly with your configuration).  By default
> > Bacula writes blocks of 64,512 bytes and not 65536 as indicated above on
> > the error message.  This makes me wonder.  In any case, I don't recommend
> > that you change the default -- especially when using disk storage without
> > doing careful performance testing.
> 
> This came from tests with various blocksizes. It has been eliminated by now.
> > 
> > 2. 99.9% of the time, if you are getting checksum errors on a disk, you
> > have a bad disk, or in this case, possibly a bad disk driver (much less
> > often, bad RAM).
> > 
> > I recommend that you try doing a backup/restore of a small amount of data
> > (e.g. 100MB) to a standard Debian fixed disk formatted with ext3.  That
> > will give you a good base of information.  If you get the same errors,
> > then it is most likely your conf file, and I would *definitely* like to
> > see it.  Otherwise you can duplicate the conf on the disk storage device
> > and repeat the test using a small amount of data to start with.
> > 
> 
> I plugged a 250 GB SATA drive into the standard onboard SATA
> controller, formatted it with ext3 and did some backup and restore
> tests. These seem to have worked. I tried restores of up to 100 GB and
> all worked well.
> 
> What this boils down to is that there seems to be a problem within the 
> 3ware RAID Controller and/or its driver.

I think we are both in agreement on that.

> 
> What is strange, though, is that the whole system runs off the same RAID 5
> array, and I have no problems with file corruption or anything else.
> The second thing is that we had similar problems with other 3ware SATA
> RAID controllers using FreeBSD.

Well, probably a more accurate statement is that you have simply been lucky 
not to have been hit by any serious problems until now.

> 
> Is it possible that the error is triggered by some special type
> of file access that Bacula uses or anything like that?

Most (probably 99.9%) of all programs access the disk through the buffered
fread/fwrite functions, which read and write blocks of data that are aligned
on sector boundaries (or even on 4096 or 8192 byte boundaries).  Bacula, on
the other hand, does unbuffered binary synchronous (blocking) I/O to the file
using *standard* OS read/write/lseek calls.  Where Bacula is different is that
the reads and writes it does are not necessarily aligned on any sector or
other multiple, but can be at arbitrary locations in the file.  Note that,
normally for performance reasons, Bacula will write big buffers, but when
reading, it will lseek() to the proper place and read only the data that it
needs.
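
To make that access pattern concrete, here is a minimal Python sketch.  It is
not taken from Bacula's source; the helper names (write_all, read_at,
unaligned_read_demo) and the offset and length defaults are made up for
illustration.  Only the 64,512-byte block size comes from this thread.  It
writes two big buffers with plain write() calls, then lseek()s to an offset
that is not a multiple of any sector size and reads an odd-sized span:

```python
import os
import tempfile

BLOCK_SIZE = 64512  # default Bacula block size mentioned earlier in this thread


def write_all(fd, data):
    """Write all of `data` with os.write(), retrying on short writes."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def read_at(fd, offset, length):
    """lseek() to an arbitrary byte offset and read up to `length` bytes."""
    os.lseek(fd, offset, os.SEEK_SET)
    chunks = []
    remaining = length
    while remaining > 0:
        chunk = os.read(fd, remaining)
        if not chunk:  # end of file reached early
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def unaligned_read_demo(offset=12345, length=777):
    """Write two large blocks, then read back a small unaligned span.

    Returns True when the bytes read match the bytes written.
    """
    payload = bytes(i % 251 for i in range(2 * BLOCK_SIZE))
    fd, path = tempfile.mkstemp()
    try:
        write_all(fd, payload)
        got = read_at(fd, offset, length)
    finally:
        os.close(fd)
        os.unlink(path)
    return got == payload[offset:offset + length]
```

On a healthy disk and driver, unaligned_read_demo() should return True for any
offset and length that fit inside the file.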

If these kinds of accesses are causing problems with the disk driver (as seems 
very likely), then the driver is simply broken and your system is in serious 
danger.  It is sort of like a ticking time bomb.

> 
> My next steps are to build the newest kernel (vanilla 2.6.20.3) and
> update the controller firmware.

Certainly updating the controller firmware should be your number one priority.
If this is the problem, there is a good chance that they have already fixed
it.  Your number two priority should be to look at the manufacturer's web
site for bug reports, and the third priority should be to get in touch with
the manufacturer.

> 
> Do you have any hint as to what I can do to track down the cause of the
> problem? Ah, the controller is a 3ware 9500S-12.

If I were having the problem, I would waste no time getting any critical work
off that machine or at least off those disks.  Then I would write (or have
someone write) a program that does random I/O on a file, at random places and
with random sizes, and carefully verifies that what was written can be
properly read back, including when it is written with one set of block sizes
and read back with a totally different set of sizes.  Such programs probably
already exist.
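
As a rough starting point, such a test could look like the Python sketch
below.  It is only an illustration of the idea, not an existing tool.  The
function name verify_random_io and all of its parameters and defaults are
made up, and it assumes a POSIX system.  It writes random-sized chunks at
random offsets while keeping an in-memory copy of what the file should
contain, then reads the file back using a different set of random offsets
and sizes and reports every span that does not match:

```python
import os
import random


def write_all(fd, data):
    """Write all of `data` with os.write(), retrying on short writes."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def read_exact(fd, offset, length):
    """lseek() to `offset` and read up to `length` bytes with os.read()."""
    os.lseek(fd, offset, os.SEEK_SET)
    chunks = []
    remaining = length
    while remaining > 0:
        chunk = os.read(fd, remaining)
        if not chunk:  # unexpected end of file
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def verify_random_io(path, file_size=1 << 20, rounds=100,
                     max_chunk=20000, seed=0):
    """Return a list of (offset, length) spans that read back incorrectly."""
    rng = random.Random(seed)
    shadow = bytearray(file_size)  # what the file is expected to contain
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        write_all(fd, bytes(shadow))  # pre-size the file with zeros

        # Phase 1: random-sized writes at random, unaligned offsets.
        for _ in range(rounds):
            length = rng.randint(1, max_chunk)
            offset = rng.randint(0, file_size - length)
            data = rng.getrandbits(8 * length).to_bytes(length, "little")
            os.lseek(fd, offset, os.SEEK_SET)
            write_all(fd, data)
            shadow[offset:offset + length] = data
        os.fsync(fd)  # ask the OS to push the data out to the device

        # Phase 2: read back with a different set of offsets and sizes.
        mismatches = []
        for _ in range(rounds):
            length = rng.randint(1, max_chunk)
            offset = rng.randint(0, file_size - length)
            if read_exact(fd, offset, length) != shadow[offset:offset + length]:
                mismatches.append((offset, length))
        return mismatches
    finally:
        os.close(fd)
```

An empty list means every span read back correctly.  Note that with a small
file the reads will most likely be served from the OS page cache rather than
from the controller, so for a real test the file would need to be much larger
than RAM (the in-memory shadow copy would then have to be replaced by
something like a reproducible pattern), or the file would have to be reopened
after the cache has been flushed.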

Another thing you can try with Bacula is to add the following directive:

  Block Positioning = no

to the Device resource in the bacula-sd.conf file.  This will prevent Bacula 
from doing lseek()s on the Volume during restores, and Bacula will *probably* 
read the blocks in big chunks that are aligned (much like fread).  I say 
*probably* because one would have to carefully examine the code to be sure. 
If the problem goes away with this directive added, AND the same data written 
to a standard non-3ware disk file has no problems with Block Positioning 
turned on, then you have pretty much proved that the problem is that the 
driver (firmware, hardware, OS, or something) is not handling lseek()s and/or 
non-sector aligned read()s correctly.

Best regards,

Kern

Please let us know how this turns out ...

> 
> Thank you very much for your help.
>     Ekkehard
> 
> 
> -- 
> : Trusted Network GmbH
> : Max-Planck-Str. 1
> : D-85716 Unterschleissheim
> : Telefon: 089 / 22 8 44 117
> : Telefax: 089 / 37 00 66 43
> : E-Mail: [EMAIL PROTECTED]
> : Web: http://www.tnib.de
> : Geschaeftsfuehrer: Joerg Staedele, Stefan Kinner
> : Sitz der Gesellschaft: Unterschleissheim
> : Registergericht: AG Muenchen HRB 108 388
> 
> -------------------------------------------------------------------------
> Take Surveys. Earn Cash. Influence the Future of IT
> Join SourceForge.net's Techsay panel and you'll get the chance to share your
> opinions on IT & business topics through brief surveys-and earn cash
> http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
> _______________________________________________
> Bacula-users mailing list
> Bacula-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bacula-users
> 
