On Wed, 24 Aug 2005, Stuart Henderson wrote:
--On 24 August 2005 10:37 +0200, Ramiro Aceves wrote:
pciide0:0:1: bus-master DMA error: missing interrupt, status=0x61
wd1a: device timeout reading fsbn 1489200 of 1489200-1489203 (wd1 bn
1489263; cn 1477 tn 7 sn 6), retrying
wd1: soft error (corrected)
wd1(pciide0:0:1): timeout
type: ata
c_bcount: 2048
c_skip: 0
pciide0:0:1: bus-master DMA error: missing interrupt, status=0x61
wd1a: device timeout reading fsbn 1486176 of 1486176-1486179 (wd1 bn
1486239; cn 1474 tn 7 sn 6), retrying
wd1: soft error (corrected)
[etc]
All hard drives have bad blocks, and most hard drives now ship with some
spare capacity. As the drive detects bad or failing blocks, it automatically
remaps them to spare blocks. This happens inside the drive, so by the time
you start noticing drive errors, the drive is usually unable to remap any
more blocks.
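If you want to see whether the timeouts keep hitting the same few blocks or
are spreading, you can tally the kernel messages quoted above. Here is a
rough sketch in Python. The function name summarize_timeouts is made up for
this example, and the pattern assumes the messages always look like the
"device timeout reading fsbn ..." lines in the quoted log (I am assuming
write errors use the same wording with "writing").

```python
import re
from collections import Counter

# Matches lines such as:
#   wd1a: device timeout reading fsbn 1489200 of 1489200-1489203 (wd1 bn
# The "writing" alternative is an assumption; only "reading" appears in the log.
TIMEOUT_RE = re.compile(
    r"^(?P<part>wd\d+[a-p]): device timeout (?:reading|writing) "
    r"fsbn (?P<fsbn>\d+) of \d+-\d+"
)

def summarize_timeouts(lines):
    """Return (timeouts per partition, sorted distinct failing fsbn values)."""
    per_partition = Counter()
    blocks = set()
    for line in lines:
        m = TIMEOUT_RE.match(line.strip())
        if m:
            per_partition[m.group("part")] += 1
            blocks.add(int(m.group("fsbn")))
    return per_partition, sorted(blocks)

# The two timeout lines from the quoted log, plus a line that should be ignored.
sample = [
    "wd1a: device timeout reading fsbn 1489200 of 1489200-1489203 (wd1 bn",
    "wd1: soft error (corrected)",
    "wd1a: device timeout reading fsbn 1486176 of 1486176-1486179 (wd1 bn",
]
counts, blocks = summarize_timeouts(sample)
print(dict(counts), blocks)
```

Feeding it the system log over a few days instead of the sample list gives a
crude picture of how many distinct blocks are involved. It is no substitute
for the drive's own counters, but it is a quick sanity check.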
smartmontools does a great job of warning you before this happens. If you
start up smartd and have it alert you when S.M.A.R.T. attributes change, you
can watch the drive slowly die over time. smartmontools is in the OpenBSD
ports tree in case you are interested in giving it a spin.
Sometimes the manufacturer's drive-test tools can be useful (Hitachi/IBM's
DFT can do some basic tests on drives from other manufacturers too). There's
also a commercial program, SpinRite, which claims to have good stress tests.