Anton B. Rang wrote:
And to panic? How can that in any sane way be a good
way to "protect" the application?
*BANG* - no chance at all for the application to
handle the problem...

I agree -- a disk error should never be fatal to the system; at worst, the file system should appear to have been forcibly unmounted (and "worst" really means that critical metadata, like the superblock/uberblock, can't be updated on any of the disks in the pool). That at least gives other applications which aren't using the file system the chance to keep going.

This is not always the desired behavior.  In particular, for a high availability
cluster, if one node is having difficulty and another is not, then we'd really
like to have the services relocated to the good node ASAP.  I think this case is
different, though...

An I/O error detected when writing a file can be reported at write() time, fsync() time, or close() time. Any application which doesn't check all three of these won't handle all I/O errors properly; and applications which care about knowing that their data is on disk must either use synchronous writes (O_SYNC/O_DSYNC) or call fsync before closing the file. ZFS should report back these errors in all cases and avoid panicking (obviously).
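As a minimal sketch of what that checking looks like (not from the original post; the path and data below are hypothetical), an application that cares about its data has to test the return value at all three points:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
	const char *path = "/tank/example.dat";	/* hypothetical file */
	const char buf[] = "important data\n";
	ssize_t n;

	/*
	 * O_DSYNC (or O_SYNC) could be added to the flags here for
	 * synchronous writes, instead of relying on fsync() below.
	 */
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd == -1) {
		perror("open");
		return (1);
	}

	/* The error may be reported at write() time... */
	n = write(fd, buf, sizeof (buf) - 1);
	if (n == -1 || (size_t)n != sizeof (buf) - 1) {
		perror("write");
		(void) close(fd);
		return (1);
	}

	/* ...or at fsync() time, when cached data is pushed to disk... */
	if (fsync(fd) == -1) {
		perror("fsync");
		(void) close(fd);
		return (1);
	}

	/* ...or at close() time. */
	if (close(fd) == -1) {
		perror("close");
		return (1);
	}

	return (0);
}

An application that skips any one of these checks can lose data silently even when the file system does report the error correctly.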

From what I recall of previous discussions on this topic (search the archives),
the difficulty is attributing a failure temporally, given that you want a file
system to have better performance by caching.

That said, it also appears that the device drivers (either the FibreChannel or SCSI disk drivers in this case) are misbehaving. The FC driver appears to be reporting back an error which is interpreted as fatal by the SCSI disk driver when one or the other should be retrying the I/O. (It also appears that either the FC driver, SCSI disk driver, or ZFS is misbehaving in the observed hang.)

Agree 110%.  When debugging layered software/firmware, it is essential to understand
all of the assumptions made at all interfaces.  Currently, ZFS assumes that a fatal
write error is in fact fatal.

So ZFS should be more resilient against write errors, and the SCSI disk or FC drivers should be more resilient against LIPs (the most likely cause of your problem) or other transient errors. (Alternatively, the ifp driver should be updated to support the maximum number of targets on a loop, which might also solve your second problem.)

NB: LIPs are a normal part of everyday life for Fibre Channel; they are not an error.

But I think Anton is right here: the way that the driver deals with incurred
exceptions is key to the upper layers being stable.  This can be tuned, but
remember that tuning may lead to instability.  We might be dealing with an
instability case here, not a functional spec problem.
 -- richard
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss