> Every time I have seen this issue (and it's been more than once, though
> until now recoverable, even if extremely painful), it's always been
> during a resilver of a failed drive with something happening on top of
> it... a panic, another drive failure, a power loss, etc. Any other time
> it's rock solid... which is the yes and no... under normal circumstances
> ZFS is very, very good and seems as safe as or safer than UFS. But in my
> experience ZFS has one really bad flaw: if there is corruption in the
> metadata, even if the stored data is 100% correct, it will fault the
> pool and that's it, it's gone, barring some luck and painful recovery
> (backups aside). Other file systems suffer from this too, but there are
> tools that, the majority of the time, will get you out of the s**t with
> little pain. Barring this Windows-based tool I haven't been able to run
> yet, ZFS appears to have nothing.

This is the difference I see here. You keep saying that all of the data on the drive is 100% correct, and that only the metadata on the drive is incorrect/corrupted. How do you know this? Especially, how do you know before you have recovered the data from the drive? ZFS metadata is stored redundantly on the drive and never in an inconsistent form (inconsistent on-disk state is what fsck repairs in most other filesystems after a crash or disk problem). If the metadata is corrupted, how would ZFS know which data is correct (computers don't understand things, they just follow the numbers)? If the redundant copies of the metadata are corrupt, what are the odds that the file data is corrupt too? In my experience, getting the metadata trashed while none of the file data is trashed is a rare event on a system with multi-drive redundancy.
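The "computers just follow the numbers" point can be made concrete. A toy sketch in Python (purely illustrative, not ZFS code; the names and layout are invented for this example) of how redundant, checksummed metadata copies let a filesystem *verify* a copy instead of guessing, and why it must fault rather than guess when no copy verifies:

```python
import hashlib

# Hypothetical sketch, not actual ZFS internals: each metadata block is
# stored as several redundant copies, and a checksum of the block is
# recorded elsewhere (in ZFS, in the parent block pointer).

def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def read_metadata(copies, expected_sum):
    """Return the first copy whose checksum verifies; None if all are bad."""
    for copy in copies:
        if checksum(copy) == expected_sum:
            return copy
    return None  # nothing verifies -> refusing to guess is the only safe move

good = b"metadata block v1"
expected = checksum(good)

# One copy silently corrupted, one intact: the checksum picks the intact one.
assert read_metadata([b"metadata block v1 CORRUPT", good], expected) == good

# Every copy corrupted: no number says which bytes are right, so the
# filesystem reports the block unrecoverable -- the pool faults.
assert read_metadata([b"garbage", b"more garbage"], expected) is None
```

A classic fsck, by contrast, has no write-time checksum to consult, so it patches inconsistent structures into *plausible* ones, which is exactly the guessing this design avoids.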
I have a friend/business partner who doesn't want to move to ZFS because his recovery method is to wait for a single drive (no redundancy, sometimes no backup) to fail and then use ddrescue to image the broken drive onto a new drive (ignoring any file corruption, because you can't really tell without ZFS). He's been using disk rescue programs for so long that he will not move to ZFS, because it doesn't have a disk rescue program. He has systems on Linux with ext3 and no mirroring or backups. I've asked about moving them to a mirrored ZFS system, and he has told me that the customer doesn't want to pay for a second drive (but will pay for hours of his time to fix the problem when it happens).

You sound a bit like him: ZFS is risky because there isn't a good drive rescue program. Sun's design was that the system should be redundant by default and checksum everything. If the drives fail, replace them. If they fail too much or too fast, restore from backup. Once the system has too much corruption, you can't recover from, or even check for, all the damage without a second off-disk copy; and if you have that off-disk copy, then you have a backup. They didn't build for the standard use case found in PCs, because disk recovery programs rarely get everything back and therefore can't be relied on when your data is important. Many PC owners have brought a PC mindset to the "UNIX" world. Sun's history predates Windows and Mac and comes from a mini/mainframe mindset (where people tried not to guess about data integrity).

Would a disk rescue program for ZFS be a good idea? Sure. Should the lack of one stop you from using ZFS? No. If you think so, I suggest that you have your data integrity priorities in the wrong order (focusing on small, rare events rather than the common base case).

Walter

-- 
The greatest dangers to liberty lurk in insidious encroachment by men of zeal, well-meaning but without understanding.
    -- Justice Louis D.
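The "checksum everything" design point above can be illustrated with a toy sketch (plain Python; hypothetical data, no relation to real ddrescue or ZFS internals): a raw image copy faithfully reproduces silently corrupted bytes, while a filesystem that recorded a checksum at write time can at least detect the damage on read.

```python
import hashlib

# Toy illustration: compare a blind byte-for-byte rescue copy with a
# write-time checksum. All data here is made up for the example.

original = b"important customer record"
stored_sum = hashlib.sha256(original).hexdigest()  # recorded at write time

# The drive later flips a byte; a ddrescue-style image just clones it.
on_disk = b"important custoner record"   # silent corruption (m -> n)
rescued = bytes(on_disk)                 # faithful copy of the bad bytes

# Without checksums, nothing distinguishes the rescued copy from good data:
assert rescued == on_disk

# With a write-time checksum, the corruption is detectable on read:
assert hashlib.sha256(rescued).hexdigest() != stored_sum
```

With redundancy on top (a mirror or raidz), detection becomes repair: a copy that verifies can overwrite the one that doesn't, which is what the "redundant by default" part of the design buys you.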
Brandeis

_______________________________________________
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"