But that's exactly the problem, Richard: AFAIK.

Can you state that absolutely, categorically, there is no failure mode out 
there (caused by hardware faults or bad drivers) that will lock a drive up 
for hours?  You can't, obviously, which is why we keep saying that ZFS should 
have this kind of timeout feature.

For once I agree with Miles; I think he's written a really good writeup of the 
problem here.  My simple view on it would be this:

Drives are only aware of themselves as an individual entity.  Their job is to 
save & restore data to themselves, and drivers are written to minimise any 
chance of data loss.  So when a drive starts to fail, it makes complete sense 
for the driver and hardware to be very, very thorough about trying to read or 
write that data, and to only fail as a last resort.

I'm not at all surprised that drives take 30 seconds to time out, nor that they 
could slow a pool for hours.  That's their job.  They know nothing else about 
the storage, they just have to do their level best to do as they're told, and 
will only fail if they absolutely can't store the data.

The RAID controller, on the other hand (NetApp, ZFS, etc.), knows all about the 
pool.  It knows if you have half a dozen good drives online, it knows if there 
are hot spares available, and it *should* also know how quickly the drives 
under its care usually respond to requests.

ZFS is perfectly placed to spot when a drive is starting to fail, and to take 
the appropriate action to safeguard your data.  It has far more information 
available than a single drive ever will, and should be designed accordingly.
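
To make that concrete, here is a rough sketch of the kind of pool-level 
check I have in mind.  It is Python for readability, not anything ZFS 
actually contains: the class name, the thresholds and the action strings are 
all made up for illustration.  The idea is only that the pool remembers how 
fast each drive usually answers, and stops waiting on a drive that is far 
slower than its own norm when redundant copies are available.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DriveLatencyTracker:
    """Hypothetical per-drive latency history; not part of ZFS."""
    alpha: float = 0.1         # smoothing factor for the moving average (assumed)
    slow_factor: float = 20.0  # "this many times slower than usual" (assumed)
    floor_ms: float = 50.0     # never flag a request faster than this (assumed)
    typical_ms: Optional[float] = None

    def is_suspect(self, latency_ms: float) -> bool:
        # With no history yet there is nothing to compare against.
        if self.typical_ms is None:
            return False
        limit = max(self.floor_ms, self.slow_factor * self.typical_ms)
        return latency_ms > limit

    def record(self, latency_ms: float) -> None:
        # Exponential moving average of observed response times.
        if self.typical_ms is None:
            self.typical_ms = latency_ms
        else:
            self.typical_ms = ((1 - self.alpha) * self.typical_ms
                               + self.alpha * latency_ms)


def choose_action(tracker: DriveLatencyTracker,
                  latency_ms: float,
                  redundancy_available: bool) -> str:
    """Decide what the pool should do about one slow or normal request."""
    if not tracker.is_suspect(latency_ms):
        # Only normal responses feed the baseline, so one hang
        # does not redefine what "usual" means for this drive.
        tracker.record(latency_ms)
        return "ok"
    if redundancy_available:
        # The pool knows other copies exist; the drive itself never will.
        return "read_from_redundancy"
    # No other copy: let the drive keep retrying, as it is designed to.
    return "wait_for_drive"
```

Note that the last case still leaves the drive to do its thorough retries. 
The pool only gives up on a drive early when it knows the data is safe 
elsewhere, which is exactly the knowledge a single drive doesn't have.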

Expecting the firmware and drivers of individual drives to control the failure 
modes of your redundant pool is just crazy, IMO.  You're throwing away some of 
the biggest benefits of using multiple drives in the first place.
-- 
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
