On Wed, May 17, 2006 at 03:22:34AM +1000, grant beattie wrote:
>
> what I find interesting is that the SCSI errors were continuous for 10
> minutes before I detached it, ZFS wasn't backing off at all. it was
> flooding the VGA console quicker than the console could print it all
> :) from what you said above, once per minute would have been more
> desirable.
The "once per minute" is related to the frequency at which ZFS tries to reopen the device. Regardless, ZFS will try to issue I/O to the device whenever asked. If you believe the device is completely broken, the correct procedure (as documented in the ZFS Administration Guide), is to 'zpool offline' the device until you are able to repair it. > I wonder why, given that ZFS knew there was a problem with this disk, > that it wasn't marked FAULTED and the pool DEGRADED? This is the future enhancement that I described below. We need more sophisticated analysis than simply 'N errors = FAULTED', and that's what FMA provides. It will allow us to interact with larger fault management (such as correlating SCSI errors, identifying controller failure, and more). ZFS is a intentionally dumb. Each subsystem is responsible for reporting errors, but coordinated fault diagnosis has to happen at a higher level. > I don't know enough about the internals to know why SVM happily > offlined the device after a short burst of errors - that's certainly > more friendly and expected. is there any way I can get the same > failure mode with ZFS? Not currently. - Eric -- Eric Schrock, Solaris Kernel Development http://blogs.sun.com/eschrock _______________________________________________ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss