This is getting embarrassing: hdparm obviously also needs to know
which drive to read from, something like "hdparm --read-sector 307316
/dev/sda".
I'll not bother the entire list with that :)
On 29. sep. 2014 10:48, Håkon Alstadheim wrote:
On 29. sep. 2014 09:32, Julien boooo wrote:
SMART Self-test log structure revision number 1
Num  Test_Description  Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline  Completed: read failure  90%        35888            307316
# 2  Short offline     Completed: read failure  90%        35887            330254
# 3  Extended offline  Completed: read failure  90%        35887            410646
Sorry for the verbiage, but you might have the clues you need to start
reassigning sectors here, though I have on occasion seen that the LBA
is reported erroneously by smartctl. You will find out soon enough if
you run:
----
$ hdparm --read-sector 307316
$ hdparm --read-sector 330254
$ hdparm --read-sector 410646
----
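A minimal sketch of the same check as a loop, with the device included on each command. It assumes the drive is /dev/sda (as in the correction at the top of this mail) and that hdparm exits non-zero when a sector read fails. The function name check_sectors and the DRY_RUN switch are made up for illustration; by default it only prints the commands.

```shell
#!/bin/sh
# Dry run by default: the hdparm commands are printed, not executed.
# Set DRY_RUN=0 and run as root to actually read from the drive.
DRY_RUN=${DRY_RUN:-1}

# check_sectors DEVICE LBA... (hypothetical helper, not part of hdparm)
check_sectors() {
    dev=$1
    shift
    for lba in "$@"; do
        if [ "$DRY_RUN" = 1 ]; then
            echo "hdparm --read-sector $lba $dev"
        elif hdparm --read-sector "$lba" "$dev" >/dev/null 2>&1; then
            echo "LBA $lba: readable"
        else
            # Assumption: a non-zero exit status means the read failed.
            echo "LBA $lba: read error"
        fi
    done
}

# The LBAs are the LBA_of_first_error values from the smartctl log above.
check_sectors /dev/sda 307316 330254 410646
```

Any LBA reported as a read error is a candidate for reassignment; one that reads fine either was already remapped or was logged with the wrong number.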
If you get errors from the above commands, you need to reassign those
sectors. If not, then smartctl may be reporting erroneously because
the drive was not able to store the correct sector number in its
internal log when the error occurred. Your numbers look good
though (they do not look like a single "highest possible integer" that
you would most likely get if the values are wrong).
If smartctl is in error, you need to find the errors in your system
logs from when they happened, i.e. you need to find the bad sectors
somewhere like /var/log/syslog (or is it /var/log/messages? I forget).
grep for 'SAT' or 'ATA' in your logs.
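The log search could look something like the sketch below. The two paths are the guesses from the paragraph above, and the helper name find_ata_lines is made up. The extra pattern 'ata[0-9]+' is an assumption added here, on the grounds that the kernel's ATA driver usually tags its messages with a lowercase port name such as "ata1".

```shell
#!/bin/sh
# Print lines mentioning SAT/ATA from whichever of the given log files exist.
# find_ata_lines is a hypothetical helper, not a standard tool.
find_ata_lines() {
    for f in "$@"; do
        # Skip files that are missing or unreadable on this system.
        [ -r "$f" ] && grep -E 'SAT|ATA|ata[0-9]+' "$f"
    done
    return 0
}

find_ata_lines /var/log/syslog /var/log/messages
```

Reading these logs normally needs root. Lines that carry a sector number are the ones to compare against the LBAs from smartctl.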
It may also be that you have "lucked out", and the sectors have been
written to, and thus reassigned automatically. This will make the next
read from that sector succeed, if the drive is not totally beyond repair.
And, like I said at first, this is merely a stop-gap while your drive
is getting progressively worse and %wa goes up in "top" (you never
told us how much wait you have).
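For a rough number without watching "top", the iowait share can be read from /proc/stat on Linux. This is only a sketch: the helper name iowait_percent is made up, and the figure is an average since boot, whereas %wa in "top" covers just the last refresh interval, so the two will not match exactly.

```shell
#!/bin/sh
# Percentage of CPU time spent in iowait since boot.
# /proc/stat "cpu" line fields: user nice system idle iowait irq softirq steal ...
# iowait_percent is a hypothetical helper; it reads a /proc/stat-style file on stdin.
iowait_percent() {
    awk '/^cpu / {
        total = 0
        # Sum user..steal (fields 2-9); guest time is already counted in user.
        for (i = 2; i <= 9 && i <= NF; i++) total += $i
        printf "%.1f\n", 100 * $6 / total
    }'
}

[ -r /proc/stat ] && iowait_percent < /proc/stat
```

A value that keeps climbing between runs points the same way as a rising %wa.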
So your plan should be:
1) Back up everything
2) Order a new drive
3) Muddle through while you wait for your replacement.
You should consider ordering TWO drives, and run them in a mirror.
Then you can set error-timeout to 7 seconds and not experience such
bad performance the next time a drive starts failing. DO NOT set that
error timeout if you only have one drive, or the chance of data loss
will increase.
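Assuming "error-timeout" here means SCT Error Recovery Control, it can be set with smartctl on drives that support it. smartctl takes the read and write limits in units of 100 ms, so 7 seconds is written as 70. The sketch below only prints the commands; scterc_cmd is a made-up helper and /dev/sda and /dev/sdb are placeholder names for the two mirror members.

```shell
#!/bin/sh
# Build the smartctl command that sets SCT ERC to the given number of seconds.
# scterc_cmd is a hypothetical helper; device names below are placeholders.
scterc_cmd() {
    dev=$1
    seconds=$2
    # smartctl expects tenths of a second: 7 s -> 70.
    ds=$((seconds * 10))
    echo "smartctl -l scterc,$ds,$ds $dev"
}

# Printed only. Run the resulting commands as root, once per mirror member.
scterc_cmd /dev/sda 7
scterc_cmd /dev/sdb 7
```

On many drives this setting does not survive a power cycle, so it is usually reapplied at boot.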
Remember, if your drive is in warranty, a replacement is free.