On Fri, Nov 28, 2008 at 5:05 AM, Richard Elling <[EMAIL PROTECTED]> wrote:
> Ross wrote:
>>
>> Well, you're not alone in wanting to use ZFS and iSCSI like that, and in
>> fact my change request suggested that this is exactly one of the things that
>> could be addressed:
>>
>> "The idea is really a two stage RFE, since just the first part would have
>> benefits. The key is to improve ZFS availability, without affecting its
>> flexibility, bringing it on par with traditional raid controllers.
>>
>> A. Track response times, allowing for lopsided mirrors, and better
>> failure detection.
>
> I've never seen a study which shows, categorically, that disk or network
> failures are preceded by significant latency changes. How do we get
> "better failure detection" from such measurements?

Not preceded by, as such, but a disk or network failure will certainly
cause significant latency changes. If the hardware is down, there's
going to be a sudden and very large change in latency. Sure, FMA will
catch most cases, but we've already shown that there are some cases
where it doesn't work too well (and I would argue that's always going
to be possible when you are relying on so many different types of
driver). This is there to ensure that ZFS can handle *all* cases.

>> Many people have requested this since it would facilitate remote live
>> mirrors.
>>
>
> At a minimum, something like VxVM's preferred plex should be reasonably
> easy to implement.
>
>> B. Use response times to time out devices, dropping them to an interim
>> failure mode while waiting for the official result from the driver. This
>> would prevent redundant pools hanging when waiting for a single device."
>>
>
> I don't see how this could work except for mirrored pools. Would that
> carry enough market to be worthwhile?
> -- richard

I have to admit, I've not tested this with a raided pool, but since all
ZFS commands hung when my iSCSI device went offline, I assumed you
would get the same effect, with the pool hanging, if a raid-z2 pool is
waiting for a response from a device. Mirrored pools do work
particularly well with this, since it gives you the potential to have
remote mirrors of your data, but if you had a raid-z2 pool you still
wouldn't want it hanging when a single device failed.

I will go and test the raid scenario on a current build though, just to
be sure.

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
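
To make parts A and B of the RFE a little more concrete, here is a rough
sketch of the idea in Python. It is only an illustration under stated
assumptions, not how ZFS does or would implement it: DeviceStats,
check_outstanding, pick_read_device, the smoothing weight, the 10x factor
and the 1000 ms floor are all made up for the example. Each device keeps
a smoothed response time (part A), reads prefer the fastest device that
is not suspect (roughly the "preferred plex" idea), and a device whose
in-flight I/O has waited far longer than usual drops to an interim
"suspect" state without waiting for the driver (part B).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DeviceStats:
    """Per-device latency tracking (hypothetical; not a ZFS structure)."""
    name: str
    avg_ms: float = 0.0    # smoothed response time of completed I/Os
    samples: int = 0
    suspect: bool = False  # interim failure state, pending the driver's verdict

    def record(self, latency_ms: float, alpha: float = 0.2) -> None:
        """Fold a completed I/O's latency into a moving average."""
        if self.samples == 0:
            self.avg_ms = latency_ms
        else:
            self.avg_ms = alpha * latency_ms + (1 - alpha) * self.avg_ms
        self.samples += 1
        # Assumption: any completed I/O clears the interim state.
        self.suspect = False

    def check_outstanding(self, waited_ms: float,
                          factor: float = 10.0,
                          floor_ms: float = 1000.0) -> bool:
        """Mark the device suspect if an in-flight I/O has waited too long.

        The limit is `factor` times the usual latency, but never below
        `floor_ms`, so a fast local disk is not flagged over a brief stall.
        """
        limit = max(floor_ms, factor * self.avg_ms)
        if waited_ms > limit:
            self.suspect = True
        return self.suspect


def pick_read_device(devices: List[DeviceStats]) -> Optional[DeviceStats]:
    """Prefer the fastest device that is not suspect (lopsided mirror)."""
    healthy = [d for d in devices if not d.suspect]
    if not healthy:
        return None
    return min(healthy, key=lambda d: d.avg_ms)
```

With a local disk averaging 5 ms and a remote iSCSI target averaging
40 ms, reads would go to the local side. If the local disk then left an
I/O outstanding for 30 seconds, it would be marked suspect and reads
would move to the remote side rather than hanging the pool.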