Thanks for the reply.

ad4: hard error reading fsbn 242727552

The error means that the disk reported an error while trying to read this block. You say that when you rebooted, the controller said a disk had gone bad, so this would sort of confirm it. (I could believe that restarting mountd might upset the RAID code if there were a kernel bug, but it seems very unlikely that it could cause a disk to go bad.)

The full error was something like the following on _both_ of the identical systems, even _before_ the reboot. After this message we could not read, write, or fsck /dev/ar0.


ad7: hard error reading fsbn 291786506 of 0-127 (ad7 bn 291786506; cn 289470 tn 11 sn 53) trying PIO mode
ad7: DMA problem fallback to PIO mode
ad7: DMA problem fallback to PIO mode
ad7: DMA problem fallback to PIO mode
ad7: DMA problem fallback to PIO mode
ad7: DMA problem fallback to PIO mode
ad7: hard error reading fsbn 291786586 of 0-127 (ad7 bn 291786586; cn 289470 tn 13 sn 7) status=59 error=40
ar0: ERROR - array broken
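
For reference, the status=59 error=40 pair in the last hard error line can be decoded bit by bit. The following is a minimal sketch, assuming the driver prints both registers in hexadecimal and that the bits follow the standard ATA status and error register layout; the function name decode and the two tables are made up for illustration.

```python
# Bit names for the ATA status and error registers (standard ATA layout,
# assumed here; the values in the log are assumed to be hexadecimal).
STATUS_BITS = {
    0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x10: "DSC",
    0x08: "DRQ", 0x04: "CORR", 0x02: "IDX", 0x01: "ERR",
}
ERROR_BITS = {
    0x80: "ICRC", 0x40: "UNC", 0x20: "MC", 0x10: "IDNF",
    0x08: "MCR", 0x04: "ABRT", 0x02: "TK0NF", 0x01: "AMNF",
}


def decode(value_hex, names):
    """Return the names of the bits set in a hex register value."""
    value = int(value_hex, 16)
    return [name for bit, name in names.items() if value & bit]


print(decode("59", STATUS_BITS))  # ['DRDY', 'DSC', 'DRQ', 'ERR']
print(decode("40", ERROR_BITS))   # ['UNC']
```

Under those assumptions, error=40 is the UNC (uncorrectable data) bit, which would be consistent with the drive itself reporting an unreadable sector.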


There were also a variety of messages like this one:
Jul 14 02:55:39 thorimage1 /kernel: ad7: hard error reading fsbn 291786586 of 0-127 (ad7 bn 291786586; cn 289470 tn 13 sn 7) status=59 error=40


where the ad7 device name could be any of the six devices in the array, seemingly at random.
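
To see how the errors are spread across the devices, the hard error lines can be tallied per device. This is only a rough sketch: the regular expression is based on the message format quoted above, the names HARD_ERROR and tally are made up for illustration, and the sample lines are copied from this message rather than from a complete log.

```python
import re
from collections import Counter

# Matches the "adN: hard error reading fsbn ..." lines quoted above.
HARD_ERROR = re.compile(r"\b(ad\d+): hard error reading fsbn (\d+)")


def tally(lines):
    """Count hard read errors per device name."""
    counts = Counter()
    for line in lines:
        match = HARD_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample = [
    "ad4: hard error reading fsbn 242727552",
    "ad7: hard error reading fsbn 291786506 of 0-127 "
    "(ad7 bn 291786506; cn 289470 tn 11 sn 53) trying PIO mode",
    "ad7: DMA problem fallback to PIO mode",
    "Jul 14 02:55:39 thorimage1 /kernel: ad7: hard error reading fsbn "
    "291786586 of 0-127 (ad7 bn 291786586; cn 289470 tn 13 sn 7) "
    "status=59 error=40",
]

for device, count in sorted(tally(sample).items()):
    print(device, count)
```

The same function could be fed the lines of the system log file (its location is not given here, so the path is left to the reader).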


My best guess would be that you have a bad batch of disks that happen to have failed in similar ways. It is possible that restarting mountd uncovered the errors, 'cos I think mountd internally does a remount of the filesystem in question and that might cause a chunk of stuff to be flushed out on to the disk, highlighting an error.

(I had a bunch of the IBM "deathstar" disks fail on me within the
space of a week or so, after they'd been in use for about six
months.)

It certainly sounds reasonable that the problem only manifested itself when mountd was restarted. It's just strange, and too much of a coincidence, that two sets of six disks in two different but identical machines would fail in exactly the same way within an hour. I guess that, given the decline in hard drive quality, things like this might be more likely.


Thanks,
Sumit

_______________________________________________
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
