I think a better question would be: what kind of tests would be most
promising for turning some subclass of these lost pools reported on
the mailing list into an actionable bug?

My first bet would be writing tools that test for ignored cache-flush
(SYNCHRONIZE CACHE) commands leading to lost writes, and applying them
to the case where iSCSI targets are rebooted but the initiator isn't.
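
Something along these lines (a minimal, untested sketch; the paths are
hypothetical, and orchestrating the actual target reboot is left out) is
the kind of tool I have in mind.  Phase one writes records and fsync()s
each one, logging acknowledged sequence numbers to local storage that is
not behind the iSCSI target; after the target has been rebooted, phase
two checks that every acknowledged record is still there:

#!/usr/bin/env python3
"""Sketch of a lost-write detector for a file on an iSCSI-backed pool.
Phase 1 writes records and fsync()s each one, logging the highest
acknowledged sequence number to local stable storage.  After the iSCSI
target is rebooted (but not the initiator), phase 2 verifies that every
acknowledged record survived; a shortfall suggests a cache flush was
acknowledged but not honored."""

import os
import struct
import sys

TEST_FILE = "/tank/cacheflush-test.dat"      # file on the iSCSI-backed pool (placeholder)
ACK_LOG   = "/var/tmp/cacheflush-acked.log"  # local disk, NOT on the pool under test
RECORD    = struct.Struct("<Q")              # 8-byte little-endian sequence number

def write_phase(count: int) -> None:
    with open(TEST_FILE, "wb", buffering=0) as data, open(ACK_LOG, "w") as log:
        for seq in range(count):
            data.write(RECORD.pack(seq))
            os.fsync(data.fileno())          # should push a cache flush down to the target
            log.write(f"{seq}\n")            # only log seq numbers whose fsync returned
            log.flush()
            os.fsync(log.fileno())

def verify_phase() -> None:
    with open(ACK_LOG) as log:
        acked = int(log.readlines()[-1])     # last sequence number fsync acknowledged
    with open(TEST_FILE, "rb") as data:
        stored = len(data.read()) // RECORD.size
    if stored <= acked:
        print(f"LOST WRITES: fsync acknowledged seq {acked}, "
              f"but only {stored} records survived the target reboot")
        sys.exit(1)
    print("no acknowledged writes were lost")

if __name__ == "__main__":
    if sys.argv[1:] == ["write"]:
        write_phase(100000)
    else:
        verify_phase()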

I think in the process of writing the tool you'll immediately bump
into a defect, because you'll realize there is no equivalent of a
'hard' iSCSI mount like there is in NFS.  And there cannot be a strict
equivalent of 'hard' mounts in iSCSI, because we want zpool redundancy
to preserve availability when an iSCSI target goes away.  I think the
whole model is wrong somehow.
I'd surely hope that a ZFS pool with redundancy built on iSCSI targets could survive the loss of some targets, whether due to actual failures or to necessary upgrades of the iSCSI targets (think OS upgrades and reboots on the systems that are offering iSCSI devices to the network).

My suggestion is to use multi-way redundancy with iSCSI (e.g. 3-way mirrors or RAIDZ2), so that you can safely offline one of the iSCSI targets while still leaving the pool with some redundancy; a rough sketch of that offline/online cycle follows below. Sure, there is an increased risk while that device is offline, but the window of opportunity for a failure of the second level of redundancy is small; and even then nothing is lost until a third device has a fault.

Failure handling should also distinguish between complete failure (e.g. the device no longer responds to commands whatsoever) and intermittent failure (e.g. a "sticky" patch of sectors, or a drive that stops responding for a minute because it has a non-changeable TLER value that may otherwise cause trouble in a RAID configuration). Drives run a gradation from complete failure through flaky to flawless; if the software running above them recognizes this, better decisions can be made about what to do when an error is encountered than with the simplistic good/failed model that RAIDs have used for years.
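
As a sketch of the maintenance procedure (a Python wrapper around the
standard zpool offline/online/status commands; the pool and device
names are placeholders, so check your platform's man pages before
relying on the exact status strings):

#!/usr/bin/env python3
"""Rough sketch of a maintenance helper: take one iSCSI-backed member
of a redundant vdev offline while its target host is upgraded and
rebooted, then bring it back and wait for the resilver to finish.
Pool and device names are placeholders."""

import subprocess
import time

POOL   = "tank"                      # placeholder pool name
DEVICE = "c3t600A0B80001234d0"       # placeholder iSCSI LUN device name

def zpool(*args: str) -> str:
    return subprocess.run(["zpool", *args], check=True,
                          capture_output=True, text=True).stdout

def main() -> None:
    # Take the device offline so the target can be rebooted without
    # the pool treating it as a surprise failure.
    zpool("offline", POOL, DEVICE)
    input(f"{DEVICE} is offline; upgrade/reboot the target, then press Enter... ")

    # Bring it back and let ZFS resilver whatever changed while it was away.
    zpool("online", POOL, DEVICE)
    while "resilver in progress" in zpool("status", POOL):
        time.sleep(30)
    print("resilver complete; pool redundancy restored:")
    print(zpool("status", POOL))

if __name__ == "__main__":
    main()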

My preference for storage behavior is that it should never cause a system panic. Graceful error recovery techniques are important. File system error messages should be passed up the line when possible, so the user can figure out that something is amiss with some files (even if not all), even when the sysadmin is not around or email notification of problems is not working. If it is possible to return a CRC error to a network share client, that would seem to be a close match for an uncorrectable checksum failure. (Windows throws these errors when it cannot read a CD/DVD.)

A good damage-mitigation feature would be some mechanism that allows a user to ignore a checksum failure, since in many user-data cases partial recovery is preferable to no recovery. To ensure that damaged files are not accidentally confused with good files, ignoring the checksum failures might only be allowed through a special "recovery filesystem" that lists only the damaged files the authenticated user has access to. From the network client's perspective, this would be another shared folder/subfolder that is only present when uncorrectable, damaged files have been found. ZFS would set up the appropriate links to replicate the directory structure of the original as needed to include the damaged files.
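
To illustrate just the directory-replication part from the outside (the
real feature would have to live inside ZFS and honor per-user access
control), something like the following could mirror the files that
`zpool status -v` reports as permanently damaged into a separate
recovery tree; the pool name and recovery path are placeholders:

#!/usr/bin/env python3
"""Back-of-the-envelope sketch of the 'recovery filesystem' idea:
collect the files that `zpool status -v` lists under its permanent
errors section and replicate their directory structure as symlinks
under a separate recovery tree, so damaged files can be salvaged
without being confused with good ones."""

import os
import subprocess

POOL          = "tank"           # placeholder pool name
RECOVERY_ROOT = "/recovery"      # placeholder location for the recovery tree

def damaged_files(pool: str) -> list[str]:
    out = subprocess.run(["zpool", "status", "-v", pool],
                         capture_output=True, text=True, check=True).stdout
    paths, collecting = [], False
    for line in out.splitlines():
        if "Permanent errors have been detected" in line:
            collecting = True
            continue
        if collecting:
            candidate = line.strip()
            if candidate.startswith("/"):      # skip object IDs that have no path
                paths.append(candidate)
    return paths

def build_recovery_tree(paths: list[str]) -> None:
    for path in paths:
        link = os.path.join(RECOVERY_ROOT, path.lstrip("/"))
        os.makedirs(os.path.dirname(link), exist_ok=True)
        if not os.path.islink(link):
            os.symlink(path, link)             # replicate the original layout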
