On Wed, 30 Jul 2008, Ross wrote:

> Imagine you had a raid-z array and pulled a drive, as I'm doing here.
> Because ZFS isn't aware of the removal, it keeps writing to that drive
> as if it's valid. That means ZFS still believes the array is online
> when in fact it should be degraded. If any other drive now fails, ZFS
> will consider the status degraded instead of faulted, and will
> continue writing data. The problem is, ZFS is writing some of that
> data to a drive which doesn't exist, meaning all that data will be
> lost on reboot.
While I do believe that device drivers, or the fault system, should notify ZFS when a device fails (and ZFS should react appropriately), I don't think that ZFS should be responsible for fault monitoring. ZFS is in a rather poor position for device fault monitoring, and if it attempts to do so it will be slow and may misbehave in other ways. The software which communicates with the device (i.e. the device driver) is in the best position to monitor the device.

The primary goal of ZFS is to be able to correctly read data which was successfully committed to disk. There are programming interfaces (e.g. fsync(), msync()) which may be used to ensure that data is committed to disk, and which should return an error if there is a problem. If you were performing your tests over an NFS mount, the results should be considerably different, since NFS requests that its data be committed to disk.

Bob

======================================
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss