On Tue, 8 Jan 2013 19:53:41 +0000 (UTC)
Grant Edwards <grant.b.edwa...@gmail.com> wrote:

> On 2013-01-08, Pandu Poluan <pa...@poluan.info> wrote:
> > On Jan 8, 2013 11:20 PM, "Florian Philipp" <li...@binarywings.net>
> > wrote:
> >>
> >
> > -- snip --
> >
> >>
> >> Hmm, good idea, albeit similar to the `md5sum -c`. Either tool
> >> leaves you with the problem of distinguishing between legitimate
> >> changes (i.e. a user wrote to the file) and decay.
> >>
> >> When you have completely static content, md5sum, rsync and friends
> >> are sufficient. But if you have content that changes from time to
> >> time, the number of false-positives would be too high. In this
> >> case, I think you could easily distinguish by comparing both file
> >> content and time stamps.
> >>
> >> Now, that of course introduces the problem that decay could occur
> >> in the same time frame as a legitimate change, thus masking the
> >> decay. To reduce this risk, you have to reduce the checking
> >> interval.
> >>
> >> Regards,
> >> Florian Philipp
> >
> > IMO, we're all barking up the wrong tree here...
> >
> > Before a file's content can change without user involvement, bit
> > rot must first get through the checksum (CRC?) of the hard disk
> > itself. There will be no 'gradual degradation of data', just
> > 'catastrophic data loss'.
> 
> When a hard drive starts to fail, you don't unknowingly get back
> "rotten" data with some bits flipped.  You get either a "seek error"
> or "read error", and no data at all.  IIRC, the same is true for
> attempts to read a failing CD.

I see what Florian is getting at here, and he's perfectly correct.

We techie types often like to think our storage is purely binary: the
cells are either on or off, and they never change unless we
deliberately make them change. We think this way because we wrap our
storage in layers that make it look that way, in the style of an API.


The truth is that our storage is subject to decay. Hard drives are
magnetic at heart, and atoms have to align and stay aligned for the
drive to work. Floppies are far worse at this, but hard drives are
not immune. Writeable CDs do not have physical pits and lands like
factory-pressed discs; they use a chemical layer (a dye, on CD-R) to
make reflective and non-reflective spots. The list of points of
corruption is long, and they all happen after the data has been
committed to physical storage.

Worse, you only find out about the corruption by reading the data
back; there is no other way to discover whether the medium and the data
are still OK. Florian wants to read the medium occasionally and verify
it while the backups are still usable, rather than wait for the point
of no return - the "read error" from a medium that failed long ago.
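
To make that concrete, here is a rough Python sketch of such a periodic
check, following Florian's suggestion of comparing both file content
and time stamps. It is only an illustration under assumptions of my
own: the function names (file_digest, snapshot, suspected_decay) are
made up, SHA-256 is an arbitrary choice of checksum, and it assumes
that a legitimate write always updates the file's mtime.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot(root):
    """Map every file under root to a (mtime_ns, digest) pair."""
    result = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            result[path] = (os.stat(path).st_mtime_ns, file_digest(path))
    return result


def suspected_decay(old, new):
    """List files whose content changed although their mtime did not.

    A changed digest with a changed mtime is treated as a legitimate
    write; a changed digest with an unchanged mtime is suspicious.
    """
    return sorted(
        path
        for path, (mtime, digest) in new.items()
        if path in old and old[path][0] == mtime and old[path][1] != digest
    )
```

The idea would be to keep the previous snapshot around, take a new one
at each check, and restore anything suspected_decay() reports from
backup. As Florian notes, decay that lands in the same interval as a
legitimate write is masked, so a shorter checking interval narrows that
window.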

Maybe Florian's data is valuable enough to warrant the effort. I
know mine isn't, but his might be.


-- 
Alan McKinnon
alan.mckin...@gmail.com

