> Fsck can only repair known faults; known
> discrepancies in the meta data.
> Since ZFS doesn't have such known discrepancies,
> there's nothing to repair.

I'm rather tired of hearing this mantra. If ZFS detects an error in part of its data structures, then there is clearly something to repair. The choice ZFS presently makes is effectively to prune the entire pool hierarchy from the point of error downward. If the error found is near the root of the pool, this renders all files inaccessible. This is rather as if fsck, when finding a corrupted UFS directory, removed all of the files within it instead of either (a) trying to repair the directory, or (b) placing them in lost+found; or, when it found a doubly-allocated block, chose to reformat the filesystem. ZFS could do *much* better here both in on-line and off-line operation.

It's misdirection to say that, because ZFS is intended to keep its pool always consistent, there are no inconsistencies possible, and no way to repair them. Almost every file system has adopted journaling for at least its metadata, which is a time-honored way to keep consistency; but almost every file system has a repair utility for when the journal is damaged or the file system is damaged in some other way. I haven't heard of a NetApp box (with its tree-structured WAFL system) suddenly making all of its data permanently inaccessible because of a disk error or software bug, but I have heard of them requiring file system repair on rare occasions.

I've described before a number of checks which ZFS could perform, and the repair operations possible. I'll add a couple more. ZFS could keep track of where its internal nodes are stored, perhaps using a bitmap journaled in a traditional way or perhaps using the ZIL; this would make recovery of individual files much easier in the event of total file system loss. ZFS could segregate data and metadata sufficiently to make it easy to identify its metadata, or use self-checksums in additional areas, which would allow much of a filesystem to be reconstructed even if top-level metadata were corrupted.
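
To make the self-checksum idea a little more concrete, here is a minimal sketch in Python of how a salvage tool might find metadata blocks without any help from the top-level structures. This is not ZFS's on-disk format or any real tool: the block layout, the `MAGIC` tag, the block size, and the function names are all hypothetical, chosen only to illustrate the principle that a block carrying its own checksum can be recognised by a linear scan of the device.

```python
import hashlib
import struct

BLOCK_SIZE = 512      # hypothetical block size, for illustration only
MAGIC = b"META"       # hypothetical tag marking a metadata block
DIGEST_LEN = 32       # length of a SHA-256 digest in bytes


def make_metadata_block(payload: bytes) -> bytes:
    """Build a self-checksummed block: MAGIC | length | payload | padding | digest."""
    body_len = BLOCK_SIZE - DIGEST_LEN
    header = MAGIC + struct.pack("<I", len(payload))
    if len(header) + len(payload) > body_len:
        raise ValueError("payload too large for one block")
    body = (header + payload).ljust(body_len, b"\0")
    # The digest covers everything in the block except the digest itself.
    return body + hashlib.sha256(body).digest()


def scan_for_metadata(image: bytes):
    """Yield (block_number, payload) for each block whose embedded checksum verifies."""
    for offset in range(0, len(image) - BLOCK_SIZE + 1, BLOCK_SIZE):
        block = image[offset:offset + BLOCK_SIZE]
        body, digest = block[:-DIGEST_LEN], block[-DIGEST_LEN:]
        if not body.startswith(MAGIC):
            continue  # ordinary data block (or free space)
        if hashlib.sha256(body).digest() != digest:
            continue  # tagged as metadata but damaged; skip it
        (length,) = struct.unpack_from("<I", body, len(MAGIC))
        start = len(MAGIC) + 4
        if length > len(body) - start:
            continue  # implausible length field
        yield offset // BLOCK_SIZE, body[start:start + length]
```

Because each block is verified independently, a scan of this kind does not care whether the root of the tree survived: any intact metadata block can be recovered and used as a starting point for reconstruction, and damaged ones are simply passed over.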
Every file system needs a repair utility, even if the only expected use case is for the elephant tripping over the fibre cables.

-- 
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss