On 08.01.2013 18:41, Pandu Poluan wrote:
>
> On Jan 8, 2013 11:20 PM, "Florian Philipp" <li...@binarywings.net> wrote:
>>
>
> -- snip --
>
[...]
>>
>> When you have completely static content, md5sum, rsync and friends are
>> sufficient. But if you have content that changes from time to time, the
>> number of false-positives would be too high. In this case, I think you
>> could easily distinguish by comparing both file content and time stamps.
>>
[...]
>
> IMO, we're all barking up the wrong tree here...
>
> Before a file's content can change without user involvement, bit rot
> must first get through the checksum (CRC?) of the hard disk itself.
> There will be no 'gradual degradation of data', just 'catastrophic data
> loss'.
>

Unfortunately, that's only partly true. Latent disk errors are a
well-researched topic [1-3]. CRCs are not perfectly reliable.

The trick is to detect and correct errors while you still have valid
backups or other types of redundancy. The only way to do this is regular
scrubbing. That's why professional archival solutions offer some kind of
self-healing, which is usually just the same as what I proposed (plus
whatever on-access integrity checks the platform supports) [4].

> I would rather focus my efforts on ensuring that my backups are always
> restorable, at least until the most recent time of archival.
>

That's the point:
a) You have to detect when you have to restore from backup.
b) You have to verify that the backup itself is still valid.
c) You have to avoid situations where undetected errors creep into the
backup.

I'm not talking about a purely theoretical possibility. I have
experienced just that: some data that I had kept lying around for years
was corrupted.

[1] Schwarz et al.: Disk Scrubbing in Large, Archival Storage Systems
http://www.cse.scu.edu/~tschwarz/Papers/mascots04.pdf

[2] Baker et al.: A fresh look at the reliability of long-term digital
storage
http://arxiv.org/pdf/cs/0508130

[3] Bairavasundaram et al.: An Analysis of Latent Sector Errors in Disk
Drives
http://bnrg.eecs.berkeley.edu/~randy/Courses/CS294.F07/11.1.pdf

[4]
http://uk.emc.com/collateral/analyst-reports/kci-evaluation-of-emc-centera.pdf

Regards,
Florian Philipp
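
P.S. To make the "compare both file content and time stamps" idea concrete, here is a minimal sketch of such a scrubbing check. It is only an illustration under stated assumptions, not a finished tool: the function names (file_digest, snapshot, scrub) and the choice of SHA-256 are mine, and it assumes that a legitimate edit always updates the file's mtime while silent corruption does not.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot(root):
    """Map each file path (relative to root) to (mtime_ns, digest)."""
    result = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            result[rel] = (os.stat(path).st_mtime_ns, file_digest(path))
    return result


def scrub(old, new):
    """Compare two snapshots and classify every file known to the old one.

    suspect:  content changed but mtime did not (possible bit rot)
    modified: content and mtime both changed (assumed legitimate edit)
    missing:  file no longer present
    """
    report = {"suspect": [], "modified": [], "missing": []}
    for rel, (old_mtime, old_digest) in sorted(old.items()):
        if rel not in new:
            report["missing"].append(rel)
            continue
        new_mtime, new_digest = new[rel]
        if new_digest == old_digest:
            continue  # content intact, nothing to report
        if new_mtime == old_mtime:
            report["suspect"].append(rel)
        else:
            report["modified"].append(rel)
    return report
```

In practice you would store the result of snapshot() somewhere safe (for example as JSON next to the backup), run snapshot() again on each scrub, and restore anything listed under "suspect" from a known-good copy. The same check can be run against the backup itself to cover points b) and c).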