One of our new FreeBSD 3.5-REL systems is periodically locking up
due to an apparent disk error. The drives are brand-new IBM 7200 RPM
60 GB ATA/66 EIDE units, in a ccd configuration as follows:
Filesystem   1K-blocks       Used     Avail Capacity  Mounted on
/dev/da0s1a      79359      20357     52654    28%    /
/dev/da0s1f    8025325    1023010   6360289    14%    /usr
/dev/da0s1e     119055       3401    106130     3%    /var
procfs               4          4         0   100%    /proc
/dev/ccd0c   239854317  127105965  93560007    58%    /spool
cat /etc/ccd.conf
# ccd   ileave  flags   component devices
ccd0    128     none    /dev/wd0s1e /dev/wd2s1e /dev/wd1s1e /dev/wd3s1e
and the error message (which repeats indefinitely, about every 10 seconds) is:
Jul 31 01:02:06 news /kernel: wd0s1e: soft error reading fsbn 81057521
of 81057520-81057551 (wd0s1 bn 81057521; cn 64331 tn 7 sn 20)
(status 58<rdy,seekdone,drq> error 1<no_dam>)
Jul 31 01:02:16 news /kernel: wd0s1e: soft error reading fsbn 81057521
of 81057520-81057551 (wd0s1 bn 81057521; cn 64331 tn 7 sn 20)
(status 58<rdy,seekdone,drq> error 1<no_dam>)
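For anyone wanting to sanity-check the numbers in that message, here is a rough Python sketch that parses one (unwrapped) log entry and confirms that the cn/tn/sn triple and the block number refer to the same sector. The geometry of 63 sectors per track and 20 tracks per cylinder is an assumption, not something taken from the disklabel; it is simply the combination under which the logged values agree. The names `parse` and `chs_to_block` are made up for this example.

```python
import re

# One log entry from above, with the wrapped lines joined back together.
LINE = ("wd0s1e: soft error reading fsbn 81057521 of 81057520-81057551 "
        "(wd0s1 bn 81057521; cn 64331 tn 7 sn 20)")

# ASSUMED geometry (not stated in the post): picked because it reproduces bn.
SECTORS_PER_TRACK = 63
TRACKS_PER_CYLINDER = 20

PATTERN = re.compile(
    r"fsbn (?P<fsbn>\d+) of (?P<first>\d+)-(?P<last>\d+) "
    r"\(\w+ bn (?P<bn>\d+); cn (?P<cn>\d+) tn (?P<tn>\d+) sn (?P<sn>\d+)\)"
)

def parse(line):
    """Pull the numeric fields out of a wd 'soft error' message."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("unrecognized message format")
    return {name: int(value) for name, value in m.groupdict().items()}

def chs_to_block(cn, tn, sn):
    """Convert cylinder/track/sector to a block number.

    The sector number is treated as zero-based, which is what the
    logged values imply under the assumed geometry.
    """
    return (cn * TRACKS_PER_CYLINDER + tn) * SECTORS_PER_TRACK + sn

fields = parse(LINE)
block = chs_to_block(fields["cn"], fields["tn"], fields["sn"])
transfer_sectors = fields["last"] - fields["first"] + 1
offset_in_transfer = fields["fsbn"] - fields["first"]

print("block from cn/tn/sn:", block, "logged bn:", fields["bn"])
print("transfer length in sectors:", transfer_sectors)
print("failing sector offset within transfer:", offset_in_transfer)
```

Under that assumed geometry the computed block matches the logged bn, so every retry is hitting the same single sector, the second one of a 32-sector transfer.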
The system gets stuck retrying this read over and over again, at
which point it becomes impossible to log in via the console or do
anything else (the machine is completely I/O bound).
Is there a trick to getting soft-recovery working with EIDE devices?
Better yet, how can I get rid of this problem without getting rid of
the new drives? It seems that the driver sits there trying to recover
from this "soft error" but never succeeds, and never marks the block
as "bad" in the replacement block table.
Is there a way to "reformat" the drive (low level, perhaps) so that
it remaps the darn fsbn to a replacement block? Or is there a way of
adding bad blocks to its table by hand?
Any help in solving this would be appreciated!
Regards,
Lew Payne
---
Lew Payne Publishing, Inc.     Dun & Bradstreet listed
994 San Antonio Road           DUNS # 055037852
Palo Alto, CA 94303