Steve Kleene wrote:
> On Mon, 24 Oct 2011 11:44:11 +0000 (UTC), I wrote:
> > That isn't a good error message. I think your disk is failing.
> > Review your /var/log/syslog and look for error messages there. I
> > expect you will see other errors logged there.

Really? I thought *I* wrote that. Wait, I did. :-) I think your mail
attribution processing isn't configured right.

> I was afraid of that. For what it's worth, there is nothing suspicious in
> syslog. Mostly it's just a list of all the e-mails sent and received. I do
> keep this machine very well backed up.

With backups you are in good shape. You might try forcing a read of
every sector (e.g. dd if=/dev/sda of=/dev/null bs=4k or some such),
because during normal use only a few sectors are actually exercised.
And perhaps others will suggest better diagnostics.

> > You didn't say what type of media you are using. Spinning disk? SSD?
> > Other?
>
> It's an old spinning disk (Maxtor DiamondMax Plus 8 6K040L0 40GB ATA/133
> HDD). The date on it is 10/31/03. It's sufficient for this machine's
> purpose.

Yes. Plenty sufficient for many purposes. No complaints here. I
have several of those still running.

> # smartctl -H /dev/sda
> SMART overall-health self-assessment test result: PASSED

(shrug) In my experience it isn't a great predictor of failure. But it
often confirms failure.

> 40 51 01 af 49 c3 e2  Error: UNC 1 sectors at LBA = 0x02c349af = 46352815

Looks like an uncorrected read error.

> # smartctl -l selftest /dev/sda
>
> Num  Test_Description  Status                   Remaining  LifeTime(hours)  LBA_of_first_error
> # 1  Short offline     Completed: read failure  60%        15032            46769249

After a short selftest it reported a read failure.

> I'm not sure how to interpret all of that output, but it looks bad. Thanks
> for your help.

If this disk were a ship at sea then it should be sending a distress
signal.

In the future you might consider installing the same smartmontools and
configuring /etc/smartd.conf to automatically run selftests on a
regular basis. Perhaps something like this example from one of my
systems.

  # Monitor all attributes, enable automatic online data collection,
  # automatic Attribute autosave, and start a short self-test Monday
  # through Friday between 3-4am, and a long self test Saturdays
  # between 3-4am.
  # On failure run all installed scripts.
  # Ignore attribute 194 temperature change.
  # Ignore attribute 190 airflow temperature change.
  /dev/sda -a -o on -S on -s (S/../../[1-5]/03|L/../../6/03) -I 194 -I 190 -m root -M exec /usr/share/smartmontools/smartd-runner

With the above running and being monitored, if there is a selftest
failure such as the one you are seeing, the runner scripts will email
a warning message to root. You will be notified of the problem
automatically. Most of the time it works that way anyway, and most of
the time it is a good warning of the problem.

Good luck!
Bob
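
P.S. If you want to poke at just the sector the error log complained
about rather than reading the whole disk, something along these lines
might help. It is only a sketch: the function names lba_to_dec and
read_sector_cmd are made up for illustration, and it assumes the drive
uses 512-byte logical sectors (likely for a disk of this age, but
worth checking). It only prints the dd command; it does not touch the
disk by itself.

```shell
#!/bin/sh
# Hypothetical helpers, not part of smartmontools.

# Convert an LBA as printed by smartctl (hex like 0x02c349af, or
# already decimal) to a decimal number.
lba_to_dec() {
    printf '%d\n' "$1"
}

# Print a dd command that would read exactly one sector at the given
# decimal LBA from the given device.
# ASSUMPTION: 512-byte logical sectors, so bs=512 makes skip= count in LBAs.
read_sector_cmd() {
    printf 'dd if=%s of=/dev/null bs=512 skip=%s count=1\n' "$1" "$2"
}

# Example (prints only, runs nothing against the disk):
#   lba_to_dec 0x02c349af                 # prints 46352815
#   read_sector_cmd /dev/sda 46352815
```

Running the printed command as root should come back with an
input/output error if that sector really is unreadable, which would
agree with what the SMART error log is already saying.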