On 2013-01-08, Alan McKinnon <alan.mckin...@gmail.com> wrote:

>> When a hard drive starts to fail, you don't unknowingly get back
>> "rotten" data with some bits flipped.  You get either a "seek error"
>> or "read error", and no data at all.  IIRC, the same is true for
>> attempts to read a failing CD.
>
> I see what Florian is getting at here, and he's perfectly correct.
>
> We techie types often like to think our storage is purely binary, the
> cells are either on or off and they never change unless we
> deliberately make them change. We think this way because we wrap our
> storage in layers to make it look that way, in the style of an API.
>
> The truth is that our storage is subject to decay. Harddrives are
> magnetic at heart, and atoms have to align and stay aligned for the
> drive to work. Floppies are infinitely worse at this, but drives are
> not immune. Writeable CDs do not have physical pits and lands like
> factory original discs have, they use chemicals to make reflective and
> non-reflective spots. The list of points of corruption is long and
> they all happen after the data has been committed to physical storage.

True.  But, in my experience, the chance of any of those failures
resulting in a successful read of incorrect data is vanishingly small.

> Worse, you only know about the corruption by reading it, there is no
> other way to discover if the medium and the data are still OK. He
> wants to read the medium occasionally

That may be a good idea, and will detect media failures.

> and verify it

That's the part I think is pointless in practice (if you're trying to
detect failing media).

> while the backups are still usable, and not wait for the point of no
> return - the "read error" from a medium that long since failed.

My point is that _comparing_data_to_a_backup_ just isn't a useful,
practical way to detect failing hard drives, optical drives, or CDs. 
I've seen a lot of hard drives, optical drives, floppy drives,
floppies, and CDs fail. The failure mode in every case has been a "seek
error" or "read error" resulting in _no_data_ rather than a read
returning erroneous data.
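
If read errors are the failure mode, then a plain read pass over the
files should be enough to surface them, with no comparison step.  A
minimal sketch in Python, under some assumptions: scan_tree and
chunk_size are made-up names for illustration, and a read served from
the OS page cache never touches the platter, so a pass like this only
exercises the medium for data that is not already cached.

```python
import os


def scan_tree(root, chunk_size=1 << 20):
    """Read every regular file under root and collect read failures.

    Hypothetical helper for illustration. Returns a list of
    (path, exception) pairs for files that could not be read in full.
    """
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as f:
                    # Read to end of file and discard the data; only
                    # the success or failure of the read matters here.
                    while f.read(chunk_size):
                        pass
            except OSError as exc:
                # A "read error" from the drive shows up as an OSError.
                failures.append((path, exc))
    return failures
```

Calling scan_tree("/some/path") and printing the returned list gives
the files that raised an error; an empty list means every file was read
to the end without the OS reporting a failure.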

It seems that in laboratory conditions, people have managed to see
erroneous data, but I'm not convinced worrying about it is worthwhile.

IMO, having backup data _is_ very valuable, but regularly reading
files and comparing them to backup copies isn't a useful way to detect
failing media.
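
For reference, the kind of comparison being discussed might look
roughly like the sketch below (Python; file_digest and compare_trees
are made-up names, and SHA-256 is just one reasonable choice of hash).
Note that a mismatch by itself does not say what caused it: the file
may simply have been edited since the backup was taken.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(live_root, backup_root):
    """List relative paths whose contents differ between two trees.

    Hypothetical helper for illustration. Only files present in both
    trees are compared; files missing from the backup are ignored.
    """
    mismatches = []
    for dirpath, _dirnames, filenames in os.walk(live_root):
        for name in filenames:
            live = os.path.join(dirpath, name)
            rel = os.path.relpath(live, live_root)
            backup = os.path.join(backup_root, rel)
            if not os.path.isfile(backup):
                continue
            if file_digest(live) != file_digest(backup):
                mismatches.append(rel)
    return sorted(mismatches)
```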

With that kind of comparison you're much more likely to detect failing
RAM (which is useful, but there are better ways to do it).

-- 
Grant Edwards               grant.b.edwards        Yow! I think I am an
                                  at               overnight sensation right
                              gmail.com            now!!

