On Wed, 30 Jul 2008, Ross wrote:

> Imagine you had a raid-z array and pulled a drive, as I'm doing here.
> Because ZFS isn't aware of the removal, it keeps writing to that drive
> as if it's valid. That means ZFS still believes the array is online
> when in fact it should be degraded. If any other drive now fails, ZFS
> will consider the status degraded instead of faulted, and will
> continue writing data. The problem is, ZFS is writing some of that
> data to a drive which doesn't exist, meaning all that data will be
> lost on reboot.
While I do believe that device drivers, or the fault system, should notify ZFS when a device fails (and ZFS should react appropriately), I don't think that ZFS should be responsible for fault monitoring. ZFS is in a rather poor position for device fault monitoring, and if it attempts to do so it will be slow and may misbehave in other ways. The software which communicates with the device (i.e. the device driver) is in the best position to monitor the device.

The primary goal of ZFS is to be able to correctly read data which was successfully committed to disk. There are programming interfaces (e.g. fsync(), msync()) which may be used to ensure that data is committed to disk, and which should return an error if there is a problem. If you were performing your tests over an NFS mount, the results should be considerably different, since NFS requests that its data be committed to disk.

Bob

======================================
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss