Hi folks,

I've got a configuration issue with RAIDframe: our gateway/firewall runs a RAID 1 for the system disk. There is no swap partition.
Recently one of the RAID disks (wd0) showed a problem:

Aug 2 17:22:35 fw01 /bsd: wd0(pciide0:0:0): timeout
Aug 2 17:53:52 fw01 /bsd: type: ata
Aug 2 17:53:52 fw01 /bsd: c_bcount: 16384
Aug 2 17:53:52 fw01 /bsd: c_skip: 0
Aug 2 17:53:52 fw01 /bsd: pciide0:0:0: bus-master DMA error: missing interrupt, status=0x21
Aug 2 17:53:52 fw01 /bsd: pciide0 channel 0: reset failed for drive 0
Aug 2 17:53:52 fw01 /bsd: wd0d: device timeout writing fsbn 46172704 of 46172704-46172735 (wd0 bn 50368000; cn 49968 tn 4 sn 4), retrying
:
:
Aug 2 17:53:52 fw01 /bsd: wd0d: device timeout writing fsbn 46172704 of 46172704-46172735 (wd0 bn 50368000; cn 49968 tn 4 sn 4)
Aug 2 17:53:52 fw01 /bsd: raid0: IO Error. Marking /dev/wd0d as failed.
Aug 2 17:53:52 fw01 /bsd: raid0: node (Wpd) returned fail, rolling forward
Aug 2 17:53:52 fw01 /bsd: pciide0:0:0: not ready, st=0xd0<BSY,DRDY,DSC>, err=0x00
Aug 2 17:53:52 fw01 /bsd: pciide0 channel 0: reset failed for drive 0
Aug 2 17:53:52 fw01 /bsd: wd0d: device timeout writing fsbn 46137472 of 46137472-46137503 (wd0 bn 50332768; cn 49933 tn 4 sn 52), retrying
:
:
Aug 2 17:53:53 fw01 /bsd: pciide0:0:0: not ready, st=0xd0<BSY,DRDY,DSC>, err=0x00
Aug 2 17:53:53 fw01 /bsd: pciide0 channel 0: reset failed for drive 0
Aug 2 17:53:53 fw01 /bsd: wd0d: device timeout writing fsbn 46152320 of 46152320-46152343 (wd0 bn 50347616; cn 49948 tn 0 sn 32)
Aug 2 17:53:53 fw01 /bsd: raid0: node (Wpd) returned fail, rolling forward

Surely wd0 is defective. That can happen. But my problem is that the machine became unresponsive for 30 minutes. It did not even answer a ping. This is not what I would expect from a RAID system.

What would you suggest to reduce the waiting time? 2 minutes would be OK, but 30 minutes of downtime is a _huge_ problem.

Do I have to expect the same for a RAID 5 built from 9 disks, only with a higher probability, because there are more disks in the loop?

Regards
Harri