On Thu, May 18, 2006 at 11:40:53PM -0600, Sanjay Nadkarni wrote:
> Since it's not exactly clear what you did with SVM I am assuming the
> following:
>
> You had a file system on top of the mirror and there was some I/O
> occurring to the mirror. The *only* time, SVM puts a device into
> maintenance is when we receive an EIO from the underlying device. So,
> in case a write occurred to the mirror, then the write to the powered
> off side failed (returned an EIO) and SVM kept going. Since all buffers
> sent to sd/ssd are marked with B_FAILFAST, the driver timeouts are low
> and the device is put into maintenance.

the test was the same in both the SVM and the ZFS case: constant reads
from the mirror device, then unplugging the power. with ZFS, the read
throughput during this test drops to around 20% of normal until the
device is manually removed from the pool, after which it returns to
normal.

> If I understand Eric correctly, ZFS attempts to see if the device is
> really gone. However I am not quite sure what Eric means by:
>
> >We currently only detect device failure when the device "goes away".
>
> Perhaps the issue here that ldi_open is successful when it shouldn't
> and therefore confusing ZFS.

yes, that seems to be the case. it appears to be caused by the way the
aac card deals with the disk going away: it offlines the disk, and the
LUN is still presented, but it now has zero length.

also, after a disk is offlined by the card, there does not seem to be a
way to tell the card to rescan the bus, so it requires a reboot (though
there is nothing ZFS could do to fix that). I believe it can be done
with the "aaccli" program provided by Adaptec, but that doesn't work
with the Solaris-provided aac driver.

> Another way to check is perform the same test, without any I/O
> occurring to the file system. Then run metastat -i (as root). This is
> similar to scrub for the volumes.

with no I/O activity on the mirror, metastat -i does not detect that
anything is wrong. with I/O activity, SVM offlines the metadevice when
it gets a fatal error from the device.

grant.
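
to illustrate the zero-length LUN case above: a successful open is not enough to show the disk is there, so a check could also confirm that the device reports a nonzero size. this is only a rough user-space sketch in Python, not what ZFS or ldi_open actually does. the function name device_looks_present is made up for illustration, and it assumes that seeking to the end of the device returns its length on the platform in question.

```python
import os


def device_looks_present(path):
    """Return True only if `path` opens AND reports a nonzero size.

    Hypothetical helper: treats a zero-length device (such as the
    offlined LUN described above) the same as one that cannot be opened.
    """
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        # The open itself failed: the device really has gone away.
        return False
    try:
        # Seeking to the end yields the length in bytes (assumption:
        # the platform supports this for the device being checked).
        size = os.lseek(fd, 0, os.SEEK_END)
    except OSError:
        return False
    finally:
        os.close(fd)
    # The open succeeded, but a zero-length device holds no usable data.
    return size > 0
```

the point is just that "open succeeded" and "device is usable" are separate questions, which is where the card's behaviour seems to confuse things.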