As far as I can tell, it all comes down to whether ZFS detects the failure 
properly, and what commands you use as it's recovering.

Running "zpool status" is a complete no-no if your array is degraded in any 
way.  It can lock up ZFS even when the pool would otherwise have recovered by 
itself.  If "zpool status" hung for you, this is probably what happened.
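
If a monitoring script has to call a command that may hang like this, one 
option is to give it a deadline.  The sketch below is only an illustration: 
run_with_deadline is a made-up helper name, and it does nothing to prevent the 
lock-up described above.  It only stops the calling script from blocking 
forever.

```python
import subprocess


def run_with_deadline(cmd, seconds):
    """Run cmd and return (finished, output).

    Hypothetical helper: gives up after `seconds` instead of blocking
    for as long as the command hangs.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    try:
        output, _ = proc.communicate(timeout=seconds)
        return True, output
    except subprocess.TimeoutExpired:
        # Request a kill but deliberately do not wait for the child:
        # a process stuck inside the kernel may never actually exit.
        proc.kill()
        return False, ""
```

For example, run_with_deadline(["zpool", "status", "-x"], 10) returns 
(False, "") if the command has not finished within 10 seconds.  The 10 second 
figure is an arbitrary assumption, not a recommended value.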

It also appears that ZFS is at the mercy of your drivers when it comes to 
detecting and reacting to the failure.  In my experience this means that when 
a device does fail, ZFS may react instantly and keep your mirror online, it may 
take 3 minutes (waiting for iSCSI to time out), or it may take a long time (if 
FMA is involved).

I've seen ZFS mirrors protect data nicely, but I've also seen a lot of very odd 
failure modes.  I'd quite happily run ZFS in production, but you can be damn 
sure it'd be on Sun hardware, and I'd test as many failure modes as I could 
before it went live.