On 08/30/2010 06:35 PM, Tobias Brink wrote:
> Steve Costaras <stev...@chaven.com> writes:
>
>> Could be due to a transient error (transmission or wild/torn read at
>> time of calculation). I see this a lot with integrity checking of
>> files here (50TiB of storage).
>>
>> Only way to get around this now is to do a known-good sha1/md5 hash of
>> data (2-3 reads of the file make sure that they all match and that the
>> file is not corrupted) save that as a baseline and then when doing
>> reads/compares if one fails do another re-read and see if the first
>> one was in error and compare that with your baseline. This is one
>> reason why I'm switching to the new generation of sas drives that have
>> ioecc checks on READS not just writes to help cut down on some of
>> this.
>>
>> Corruption does occur as well and is more probable with the higher the
>> capacity of the drive. Ideally you would have a drive that would
>> do ioecc on reads, plus using T10 PI extensions (DIX/DIF) from drive
>> to controller up to your file system layer. It won't always prevent
>> it by itself but would allow if you have a raid setup to do some
>> self-healing when a drive reports a non transient (i.e. corrupted
>> sector of data).
>
> First off, thanks for the answers. The thing is that I am well aware of
> the reliability problems of hard drives and I would love to use some
> advanced file system like ZFS or btrfs, but I am on Debian and I will
> stay on Debian. And btrfs is not mature enough to be used in production
> at the moment. The other thing is that I do not think that this is an
> issue of corruption of the data itself! As I said I checked the files
> against backups and MD5 sums supplied by Debian (several times and from
> cold cache) and the data seems to be OK. The executables that are
> reported by Bacula to have changed continue to work well and bug-free
> just as before.
>
> So I think this is a problem/bug with either the Postgresql database or
> Bacula, not with my hard drives. I just wonder how something like this
> could happen and how I could avoid this. I'm also not willing to do
> additional checksums with other programs (AIDE or similar) because they
> take _lots_ of time to run. With Bacula I get the checksums for free.
> I just want to use them to detect corruption on disk from time to time
> and because I use VirtualFull and want to know if my differential
> backups have missed something.
>
> So I still don't know how to proceed. Apart from that I will try to
> upgrade my director and sd to 5.0.2 as soon as Debian backports are
> available and see if the problem goes away. I will also re-run the
> DiskToCatalog after my next differential backup and see if something
> is different.
>
> Thanks,
> Tobias
>
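The baseline-and-re-read procedure Steve describes in the quoted text
could be sketched roughly as follows. This is only an illustration: the
function names (file_digest, make_baseline, verify) and their parameters
are made up for this example and are not part of Bacula or any other
tool mentioned in the thread.

```python
import hashlib


def file_digest(path, algo="sha1", chunk_size=1 << 20):
    """Hash a file in chunks so large files need not fit in memory."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def make_baseline(path, reads=3, algo="sha1"):
    """Return a digest only if `reads` separate reads all agree."""
    digests = set(file_digest(path, algo) for _ in range(reads))
    if len(digests) != 1:
        raise IOError("inconsistent reads of %s: %s" % (path, sorted(digests)))
    return digests.pop()


def verify(path, baseline, retries=1, algo="sha1"):
    """Compare against the baseline; on a mismatch, re-read up to
    `retries` times to rule out a transient read error."""
    for _ in range(retries + 1):
        if file_digest(path, algo) == baseline:
            return True
    return False
```

Note that repeated reads may be served from the operating system's page
cache rather than from disk. The sketch does not drop caches, so it is
only meaningful from a cold cache, as Tobias mentions for his own checks.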
Tobias,

I use this little Python script to extract information, which I use to
track duplicate files (users are users :-)

I hope it gives you some inspiration and helps you decode the lstat
column. (If I remember correctly, someone has also done the same in
PL/pgSQL: check the list archives.)

#!/usr/bin/python
# -*- coding: utf-8 -*-
#
# Call it with a jobid and pipe the output to a CSV file.
#
from __future__ import print_function

import sys

import MySQLdb

B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

jobid = sys.argv[1]


def to_text(value):
    # BLOB columns may come back as bytes on Python 3; on Python 2 they
    # are plain str and are passed through unchanged.
    if isinstance(value, bytes) and not isinstance(value, str):
        return value.decode('utf-8')
    return value


def base64_decode_lstat(record, position):
    # Decode one space-separated LStat field, most significant digit
    # first. Raises ValueError on characters outside the alphabet.
    val = 0
    for char in record.split(' ')[position]:
        val = val * 64 + B64.index(char)
    return val


# Adjust host, user, passwd and db to your setup.
db = MySQLdb.connect(host="localhost", user="bacula", passwd="bacula",
                     db="bacula")
db.set_character_set('utf8')
cursor = db.cursor()
cursor.execute('SET NAMES utf8;')
cursor.execute('SET CHARACTER SET utf8;')
cursor.execute('SET character_set_connection=utf8;')

# The jobid is passed as a query parameter instead of being
# interpolated into the SQL string.
cursor.execute("""
    SELECT File.MD5 AS checksum,
           CONVERT(Filename.Name USING utf8) AS filename,
           CONVERT(Path.Path USING utf8) AS path,
           File.LStat AS lstat
      FROM File, Filename, Path
     WHERE File.JobId = %s
       AND Filename.FilenameId = File.FilenameId
       AND Path.PathId = File.PathId
     ORDER BY File.MD5, Filename.Name, Path.Path""", (jobid,))
result = cursor.fetchall()

# The output has no header of its own, so print one.
print('"checksum";"filename";"path";"lstat";"gid";"uid";"bytes";'
      '"blocksize";"blocks_allocated";"atime";"mtime";"ctime"')

# LStat positions used: 5 = gid, 4 = uid, 7 = size, 8 = blksize,
# 9 = blocks, 10 = atime, 11 = mtime, 12 = ctime (position 6 is rdev).
for record in result:
    checksum, filename, path, lstat = [to_text(col) for col in record]
    decoded = [base64_decode_lstat(lstat, pos)
               for pos in (5, 4, 7, 8, 9, 10, 11, 12)]
    row = [checksum, filename, path, lstat] + decoded
    print(';'.join('"%s"' % field for field in row))

# No empty line at the end yet, so print one.
print()

-- 
Bruno Friedmann br...@ioda-net.ch
Ioda-Net Sàrl www.ioda-net.ch
openSUSE Member
User www.ioda.net/r/osu
Blog www.ioda.net/r/blog
fsfe fellowship www.fsfe.org (bruno.friedmann (at) fsfe.org )
tigerfoot on irc
GPG KEY : D5C9B751C4653227

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
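
For anyone who wants to try the LStat decoding from the script in this
message without a database, here is a minimal standalone sketch. It
assumes that the LStat column holds space-separated base64 numbers in
the order dev, ino, mode, nlink, uid, gid, rdev, size, blksize, blocks,
atime, mtime, ctime, and that a leading '-' marks a negative value. The
names B64, FIELDS, LStat, decode_b64 and decode_lstat are chosen for
this example only.

```python
from collections import namedtuple

B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

# Assumed field order of the first 13 LStat fields (see the note above).
FIELDS = ("dev", "ino", "mode", "nlink", "uid", "gid", "rdev",
          "size", "blksize", "blocks", "atime", "mtime", "ctime")
LStat = namedtuple("LStat", FIELDS)


def decode_b64(text):
    """Decode one unpadded base64 number, most significant digit first."""
    sign = -1 if text.startswith("-") else 1
    val = 0
    for ch in text.lstrip("-"):
        val = val * 64 + B64.index(ch)  # ValueError on unknown characters
    return sign * val


def decode_lstat(lstat):
    """Split an LStat string and decode its first 13 fields by name."""
    parts = lstat.split()
    if len(parts) < len(FIELDS):
        raise ValueError("expected at least %d fields, got %d"
                         % (len(FIELDS), len(parts)))
    return LStat(*[decode_b64(p) for p in parts[:len(FIELDS)]])
```

With a made-up input such as "A B C D E F G BA I J K L M A A A",
decode_lstat(...).size comes out as 64, because "BA" is 1 * 64 + 0.
The timestamp fields are plain epoch seconds and can be passed to
time.localtime() for display.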