On Tue, Jan 8, 2013 at 2:06 PM, Florian Philipp <li...@binarywings.net> wrote:
> On 08.01.2013 18:35, Volker Armin Hemmann wrote:
>> On Tuesday, 8 January 2013, 08:27:51, Florian Philipp wrote:
>>> On 08.01.2013 00:20, Alan McKinnon wrote:
>>>> On Mon, 07 Jan 2013 21:11:35 +0100
>>>>
>>>> Florian Philipp <li...@binarywings.net> wrote:
>>>>> Hi list!
>>>>>
>>>>> I have a use case where I am seriously concerned about bit rot [1]
>>>>> and I thought it might be a good idea to start looking for it in my
>>>>> own private stuff, too.
>>>
>>> [...]
>>>
>>>>> [1] http://en.wikipedia.org/wiki/Bit_rot
> [...]
>>>> If you mean disk file corruption, then doing it file by file is a
>>>> colossal waste of time IMNSHO. You likely have >1,000,000 files. Are
>>>> you really going to md5sum each one daily? Really?
>>>
>>> Well, not daily but often enough that I likely still have a valid copy
>>> as a backup.
>>
>> and who guarantees that the backup is the correct file?
>>
>
> That's why I wanted to store md5sums (or sha2sums).
>
>> btw, the solution is zfs and weekly scrub runs.
>>
>
> Seems so.
And, while it's not especially likely, there's always the possibility that the corruption sits in the checksum table rather than in the file being checked, so when a discrepancy turns up you have to verify the table as well. The odds of exactly the right bits flipping so that a corrupted file still matches a corrupted hash, within the time between checks, are low enough that I'd gamble on never seeing it happen in a reasonable lifetime. A quick sketch of the table-plus-table-checksum idea follows below.

--
Poison [BLX]
Joshua M. Murphy
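[Editorial sketch of the double-check described above, in Python. The file names (checksums.json, checksums.json.sha256) and the use of SHA-256 rather than md5 are illustrative assumptions, not anyone's actual script. It records a digest for every file plus a digest of the manifest itself, so a later mismatch can be attributed either to the file or to the table.]

#!/usr/bin/env python3
# Sketch: detect bit rot by comparing SHA-256 sums against a stored
# manifest, and checksum the manifest itself so corruption in the table
# is caught too. Assumes the manifest lives outside the scanned tree.
import hashlib, json, os, sys

MANIFEST = "checksums.json"             # maps path -> sha256 hex digest
MANIFEST_SUM = "checksums.json.sha256"  # guards the manifest itself

def sha256_of(path):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def scan(root):
    sums = {}
    for dirpath, _, names in os.walk(root):
        for name in names:
            p = os.path.join(dirpath, name)
            sums[p] = sha256_of(p)
    return sums

def write_manifest(sums):
    with open(MANIFEST, "w") as f:
        json.dump(sums, f, indent=1, sort_keys=True)
    # Record a checksum of the manifest so that a later mismatch can be
    # blamed on the file or on the table, not left ambiguous.
    with open(MANIFEST_SUM, "w") as f:
        f.write(sha256_of(MANIFEST) + "\n")

def verify(root):
    with open(MANIFEST_SUM) as f:
        expected = f.read().strip()
    if sha256_of(MANIFEST) != expected:
        sys.exit("manifest itself is corrupt -- trust neither side")
    with open(MANIFEST) as f:
        old = json.load(f)
    for path, digest in old.items():
        if not os.path.exists(path):
            print("missing:", path)
        elif sha256_of(path) != digest:
            print("MISMATCH:", path)

if __name__ == "__main__":
    # usage: bitrot.py init <dir>   or   bitrot.py verify <dir>
    if sys.argv[1] == "init":
        write_manifest(scan(sys.argv[2]))
    else:
        verify(sys.argv[2])

[Keeping the table's own checksum out-of-band, ideally on a different disk, is what makes the "which side rotted" question answerable; it is a poor man's version of what zfs gets for free from its per-block checksums plus scrub.]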