Re: [zfs-discuss] "ZFS, Smashing Baby" a fake???

Richard Elling Mon, 24 Nov 2008 20:49:19 -0800

Scara Maccai wrote:
>> In the worst case, the device would be selectable, but not responding
>> to data requests, which would lead through the device retry logic and
>> can take minutes.
>
> that's what I didn't know: that a driver could take minutes (hours???) to 
> decide that a device is not working anymore.
>


On Solaris, the sd driver defaults to a 60 second timeout with 5 retries.
The ssd driver uses 3 retries.  But sometimes additional tests are made
to verify that the disk is really not working properly, which adds more
timeout-and-retry cycles.  Again, it depends on the failure mode.
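
As a rough worked example of how those defaults add up, the Python sketch
below multiplies the timeout by the number of attempts.  It assumes the
retries come on top of the first attempt, that every attempt runs to the
full timeout, and that ssd uses the same 60 second timeout.  The name
worst_case_seconds is only illustrative, not anything in the driver.

```python
# Rough worst-case estimate of how long a driver may keep retrying one command.
# Assumptions (not taken from the driver source): retries are issued in
# addition to the first attempt, and every attempt runs to the full timeout.

def worst_case_seconds(timeout_s, retries):
    """Upper bound for one command: the initial attempt plus each retry."""
    return timeout_s * (1 + retries)

sd_delay = worst_case_seconds(60, 5)   # sd: 60 second timeout, 5 retries
ssd_delay = worst_case_seconds(60, 3)  # ssd: 3 retries, same timeout assumed

print(f"sd:  {sd_delay} s ({sd_delay / 60:.0f} min)")
print(f"ssd: {ssd_delay} s ({ssd_delay / 60:.0f} min)")
```

Under those assumptions a single command stalls for a few minutes, not
hours.  Any additional verification tests come on top of that.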

> Now comes another question: how can one assume that a drive failure won't
> take one hour to be acknowledged by the driver? That is: what good is a
> failover strategy if it takes one hour to start? I'm grateful that the system
> doesn't write until it knows what is going on, but that can't take that long.
>

AFAIK, there are no cases where the timeouts would add up to an hour's
delay before a decision is made.  Usually, the policy is set in advance,
as with the zpool failmode property.
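
To make the failmode policy concrete, here is a minimal Python sketch that
builds the zpool command lines for reading and setting the property.  The
pool name "tank" and the helper names are placeholders, and the sketch
only prints the commands rather than running them.  The accepted values
are wait (block I/O until the device comes back, the default), continue
(return an error to new write requests), and panic.

```python
import shlex

# The three values the zpool failmode property accepts.
FAILMODES = ("wait", "continue", "panic")

def failmode_get_cmd(pool):
    """Command line that shows the current failmode of a pool."""
    return ["zpool", "get", "failmode", pool]

def failmode_set_cmd(pool, mode):
    """Command line that sets failmode, rejecting unknown values early."""
    if mode not in FAILMODES:
        raise ValueError(f"failmode must be one of {FAILMODES}, got {mode!r}")
    return ["zpool", "set", f"failmode={mode}", pool]

# "tank" is a hypothetical pool name used only for illustration.
print(shlex.join(failmode_get_cmd("tank")))
print(shlex.join(failmode_set_cmd("tank", "continue")))
```

The printed lines can be run as-is on a host with ZFS, given sufficient
privileges.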
 -- richard


_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
