> From the ZFS Administration Guide, Chapter 11, Data Repair section:
> Given that the fsck utility is designed to repair known pathologies
> specific to individual file systems, writing such a utility for a file
> system with no known pathologies is impossible.

That's a fallacy (and it is incorrect even for the UFS fsck; see the 
McKusick/Kowalski paper and the distinction it draws between 'expected' 
corruptions and other inconsistencies).

First, there are two kinds of utilities that might be useful when a ZFS pool 
has become corrupted. The first is a file system checking utility (call it 
zfsck); the second is a data recovery utility. The difference between them is 
that the first tries to bring the pool (or file system) back to a usable 
state, while the second simply tries to recover the files to a new location.

What does a file system check do?  It verifies that a file system is internally 
consistent, and makes it consistent if it is not.  If ZFS were always 
consistent on disk, then only a verification would be needed.  Since we have 
evidence that it is not always consistent in the face of hardware failures, at 
least, repair may also be needed.  This doesn't need to be that hard.  For 
instance, the space maps can be reconstructed by walking the various block 
trees; the uberblock effectively has several backups (though it might be better 
in some cases if an older backup were retained); and the ZFS checksums make it 
easy to identify block types and detect bad pointers. Files can be marked as 
damaged if they contain pointers to bad data; directories can be repaired if 
their hash structures are damaged (as long as the names and pointers can be 
salvaged); etc.  Much more complex file systems than ZFS have file system 
checking utilities, because journaling, COW, etc. don't help you in the face 
of software bugs or certain classes of hardware failures.
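To make the reconstruction idea concrete, here is a toy sketch of rebuilding a space map by walking block trees. Everything in it is hypothetical: the Block class, the SHA-256 checksum, and the flat offset/size model are illustrative stand-ins, not ZFS's actual on-disk format. The point is only that allocation state is derivable from the block trees, and that checksums let the walker refuse to trust a bad pointer's subtree.

```python
import hashlib

# Hypothetical in-memory block model (NOT the real ZFS blkptr_t layout):
# each block pointer records an extent (offset, size) and a payload checksum.
class Block:
    def __init__(self, offset, size, payload, children=()):
        self.offset = offset
        self.size = size
        self.payload = payload
        self.checksum = hashlib.sha256(payload).hexdigest()
        self.children = children

def walk(block, allocated):
    """Depth-first walk recording every reachable extent that verifies."""
    if hashlib.sha256(block.payload).hexdigest() != block.checksum:
        return  # bad pointer: the checksum says not to trust this subtree
    allocated.append((block.offset, block.size))
    for child in block.children:
        walk(child, allocated)

def rebuild_space_map(roots, device_size):
    """Reconstruct the free list as the complement of reachable extents."""
    allocated = []
    for root in roots:
        walk(root, allocated)
    allocated.sort()
    free, cursor = [], 0
    for offset, size in allocated:
        if offset > cursor:
            free.append((cursor, offset - cursor))
        cursor = max(cursor, offset + size)
    if cursor < device_size:
        free.append((cursor, device_size - cursor))
    return free
```

For example, a root extent at (100, 50) pointing at a leaf at (300, 100) on a 1000-unit device yields the free extents (0, 100), (150, 150), and (400, 600). Corrupt blocks simply fall out of the allocated set, which is the conservative choice for a repair tool.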

A recovery tool is even simpler, because all it needs to do is find a tree root 
and then walk the file system, discovering directories and files, verifying 
that each of them is readable by using the checksums to check intermediate and 
leaf blocks, and extracting the data.  The tricky bit with ZFS is simply 
locating a sufficiently recent root, so that the newest copy of the data can 
be identified.
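The recovery walk can be sketched in a few lines. Again, the Node class and SHA-256 checksum below are illustrative assumptions, not ZFS's real structures; a real tool would parse raw device blocks. The sketch shows the shape of the algorithm: walk from a root, verify each block's checksum before trusting it, extract files that verify, and record (rather than silently drop) the ones that don't.

```python
import hashlib

# Hypothetical tree model (not ZFS's on-disk format): a Node is a directory
# if it has entries, otherwise a plain file whose payload is its data.
class Node:
    def __init__(self, name, payload=b"", checksum=None, entries=()):
        self.name = name
        self.payload = payload
        self.checksum = checksum or hashlib.sha256(payload).hexdigest()
        self.entries = entries

def recover(node, path="", out=None):
    """Walk from a root, extracting every file whose blocks verify.

    Returns a dict mapping paths to file contents, with None marking
    files or directories that failed checksum verification."""
    if out is None:
        out = {}
    full = f"{path}/{node.name}"
    if hashlib.sha256(node.payload).hexdigest() != node.checksum:
        out[full] = None  # damaged: checksum mismatch, record and skip
        return out
    if node.entries:            # directory: recurse into its entries
        for child in node.entries:
            recover(child, full, out)
    else:                       # plain file: copy its data out
        out[full] = node.payload
    return out
```

Nothing here needs to modify the pool, which is what makes a recovery tool so much simpler than a repair tool: it is a read-only tree walk plus a copy.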

Almost every file system starts out without an fsck utility, and implements one 
once it becomes obvious that "sorry, you have to reinitialize the file system" 
-- or worse, "sorry, we lost all of your data" -- is unacceptable to a certain 
proportion of customers.
 
 
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss