> Fsck can only repair known faults; known
> discrepancies in the meta data.
> Since ZFS doesn't have such known discrepancies,
> there's nothing to repair.

I'm rather tired of hearing this mantra.

If ZFS detects an error in part of its data structures, then there is clearly 
something to repair.

The choice ZFS presently makes is effectively to prune the entire pool 
hierarchy from the point of error downward. If the error found is near the root 
of the pool, this renders all files inaccessible.

This is rather as if fsck, on finding a corrupted UFS directory, removed all 
of the files within it instead of either (a) trying to repair the directory or 
(b) placing them in lost+found; or, on finding a doubly-allocated block, 
chose to reformat the file system.

ZFS could do *much* better here, in both on-line and off-line operation.  It's 
misdirection to say that, because ZFS is intended to keep its pool always 
consistent, no inconsistencies are possible and there is no way to repair them.  
Almost every file system has adopted journaling for at least its metadata, 
which is a time-honored way to keep consistency; yet almost every one of them 
also has a repair utility for when the journal is damaged or the file system is 
damaged in some other way. I haven't heard of a NetApp box (with its 
tree-structured WAFL system) suddenly making all of its data permanently 
inaccessible because of a disk error or software bug, but I have heard of them 
requiring file system repair on rare occasions.

I've described before a number of checks which ZFS could perform, and the 
repair operations possible.  I'll add a couple more.  ZFS could keep track of 
where its internal nodes are stored, perhaps using a bitmap journaled in a 
traditional way or perhaps using the ZIL; this would make recovery of 
individual files much easier in the event of total file system loss.  ZFS could 
segregate data and metadata sufficiently to make it easy to identify its 
metadata, or use self-checksums in additional areas, which would allow much of 
a filesystem to be reconstructed even if top-level metadata were corrupted.
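
To make the self-checksum idea concrete, here is a minimal sketch in Python. 
It is an illustration only and not ZFS's on-disk format: the block size, the 
MAGIC marker, the header layout, and both function names are assumptions 
invented for the example. Each metadata block carries a marker plus a checksum 
of its own payload, so a recovery tool can scan a raw image and pick out the 
intact metadata blocks without any help from the top-level structures.

```python
import hashlib
import struct

# All of these values are hypothetical; none comes from the ZFS on-disk format.
BLOCK_SIZE = 512
MAGIC = b"META"
# Header layout: 4-byte marker, 4-byte payload length, 32-byte SHA-256 of payload.
HEADER = struct.Struct("<4sI32s")


def make_metadata_block(payload: bytes) -> bytes:
    """Build one self-describing, self-checksummed metadata block."""
    if len(payload) > BLOCK_SIZE - HEADER.size:
        raise ValueError("payload too large for one block")
    digest = hashlib.sha256(payload).digest()
    block = HEADER.pack(MAGIC, len(payload), digest) + payload
    return block.ljust(BLOCK_SIZE, b"\0")  # pad to a full block


def scan_for_metadata(image: bytes):
    """Return (offset, payload) for every block whose self-checksum verifies."""
    found = []
    for offset in range(0, len(image) - BLOCK_SIZE + 1, BLOCK_SIZE):
        block = image[offset:offset + BLOCK_SIZE]
        magic, length, digest = HEADER.unpack_from(block)
        if magic != MAGIC or length > BLOCK_SIZE - HEADER.size:
            continue  # not a metadata block, or the header itself is damaged
        payload = block[HEADER.size:HEADER.size + length]
        if hashlib.sha256(payload).digest() == digest:
            found.append((offset, payload))
    return found


# A tiny "pool" image: data, good metadata, empty block, damaged metadata.
image = bytearray()
image += b"\xff" * BLOCK_SIZE
image += make_metadata_block(b"dir: /home")
image += b"\x00" * BLOCK_SIZE
damaged = bytearray(make_metadata_block(b"dir: /etc"))
damaged[HEADER.size] ^= 0x01  # flip one bit in the payload
image += damaged

recovered = scan_for_metadata(bytes(image))
```

In this toy image the scan recovers the intact block and skips the damaged 
one. A real tool would then have to stitch the recovered blocks back into a 
tree, which is the hard part and is not attempted here.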

Every file system needs a repair utility, even if the only expected use case is 
for the elephant tripping over the fibre cables.
-- 
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
