See below
Tony van der Hoff wrote:
On 17/10/15 17:47, Miles Fidelman wrote:
Dominique Dumont wrote:
On Saturday 17 October 2015 14:15:52 Tony van der Hoff wrote:
Can anyone please explain what it means, and whether I should be
worried?
You should check the drive with smartctl.
See http://www.smartmontools.org/
HTH
Yes.. and be sure to go beyond the basic tests.
First off, make sure it's running:
smartctl -s on -A /dev/disk0
(for each drive, using the appropriate /dev/..)
Then, after it's accumulated some stats:
smartctl -A /dev/disk0
For a lot of drives, the first line (raw read errors) can be very
telling: anything other than 0, and your disk is failing.
Start-up time can also be telling, if it's increasing.
The thing is that most drives, except those designed for use in RAID
arrays, mask impending disk failures by re-reading blocks multiple
times. They often get the data eventually, but your machine keeps
getting slower and slower.
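
The attribute table printed by `smartctl -A` can also be read by a script rather than by eye. What follows is a minimal sketch, assuming the usual ten-column layout with the attribute name in the second column and the raw value in the tenth. The function name and the sample numbers are made up for illustration and are not taken from any drive in this thread.

```python
def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} from `smartctl -A` style text.

    Assumes the common ten-column table layout; rows that do not start
    with a numeric attribute ID (headers, blank lines) are skipped.
    """
    attrs = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 10 or not parts[0].isdigit():
            continue
        raw = parts[9]  # first token of the RAW_VALUE column
        if raw.isdigit():
            attrs[parts[1]] = int(raw)
    return attrs


# Illustrative sample only; the values are invented.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   117   099   006    Pre-fail  Always       -       142857142
  3 Spin_Up_Time            0x0003   097   097   000    Pre-fail  Always       -       0
195 Hardware_ECC_Recovered  0x001a   117   099   000    Old_age   Always       -       142857142
"""

attrs = parse_smart_attributes(SAMPLE)
```

To use it on a real drive, the text could come from something like `subprocess.run(["smartctl", "-A", "/dev/sda"], capture_output=True, text=True).stdout`, with `/dev/sda` replaced by the appropriate device.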
Thanks Miles, and tomás, for your helpful replies.
I apologise for the delay in replying, but I've been away from my desk
for a few days.
I have, however, been doing some extensive googling, and it would appear
that the raw read error count (RREC) is something of a red herring,
especially when applied to Seagate drives, as these are. Both my
drives have quite high RREC values (in the millions), numbers which are
precisely matched by the Hardware ECC Recovered counts, suggesting
that the RREC is merely an artifact of HDDs being essentially
mechanical devices pushed to their limits using clever technology.
The SMART extended tests reveal no problems.
The Wikipedia entry https://en.wikipedia.org/wiki/S.M.A.R.T. is
particularly informative on the relative importance of these error
counts; the RREC can be safely ignored, as somebody else here recently
suggested.
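
The comparison described above (raw read errors matched exactly by the Hardware ECC Recovered count) can be written as a small check. This is only a sketch: the function name is hypothetical, the attribute names are the ones smartctl commonly prints, and the numbers in the example are invented. The meaning of raw values is vendor-specific, so a match says only that the two counters agree, not that the drive is healthy.

```python
def read_errors_all_recovered(attrs):
    """Compare Raw_Read_Error_Rate with Hardware_ECC_Recovered.

    `attrs` maps attribute names to raw integer values.
    Returns True if the two raw values are equal, False if they differ,
    and None if either attribute is missing.
    """
    rrec = attrs.get("Raw_Read_Error_Rate")
    ecc = attrs.get("Hardware_ECC_Recovered")
    if rrec is None or ecc is None:
        return None
    return rrec == ecc


# Invented values, for illustration only.
matched = read_errors_all_recovered(
    {"Raw_Read_Error_Rate": 3000000, "Hardware_ECC_Recovered": 3000000}
)
```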
You're missing the point.
As Wikipedia also points out
<https://en.wikipedia.org/wiki/S.M.A.R.T.#cite_note-seagate1-2>:
"Mechanical failures account for about 60% of all drive failures." and
"Further, 36% of drives failed without recording any S.M.A.R.T. error
at all, except the temperature, meaning that S.M.A.R.T. data alone was
of limited usefulness in anticipating failures."
Today's disk drives are designed to PROTECT DATA, AND MAINTAIN ACCESS TO
DATA, until the very moment before the drive fails catastrophically.
The "Hardware ECC Recovered Count" indicates that:
- there are likely to be problems with the underlying media that the ECC
is recovering from, that will only get worse over time
- the recovery takes time, hence the reason your system is slowing down -
the more underlying errors, the more time it takes to recover
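
Since the concern here is counters that get worse over time, one way to watch for that is to save the raw values periodically and compare snapshots. The sketch below assumes each snapshot is a plain dict of attribute name to raw value; the function name and the numbers are hypothetical.

```python
def attribute_changes(earlier, later):
    """Return {name: later - earlier} for attributes whose raw value changed.

    Only attributes present in both snapshots are compared.
    """
    return {
        name: later[name] - earlier[name]
        for name in later
        if name in earlier and later[name] != earlier[name]
    }


# Invented snapshots, e.g. taken a week apart.
last_week = {"Hardware_ECC_Recovered": 1000, "Spin_Up_Time": 0}
this_week = {"Hardware_ECC_Recovered": 1500, "Spin_Up_Time": 0}

changes = attribute_changes(last_week, this_week)
```

Which rate of change counts as worrying is a judgment call and depends on the drive model; the sketch only reports the differences.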
I've never found SMART extended tests to be indicative of anything,
until a disk is nearly dead. Though
http://www.z-a-recovery.com/manual/smart.aspx gives a good list of other
SMART variables that might indicate mechanical failures.
If your drives are a couple of years old, and your machine is getting
slower, don't engage in wishful thinking - back up and get new drives.
Miles
--
In theory, there is no difference between theory and practice.
In practice, there is. .... Yogi Berra