On Jul 25, 2009, at 15:32, Frank Middleton wrote:
Can you comment on if/how mirroring or raidz mitigates this, or tree corruption in general? I have yet to lose a pool even on a machine with fairly pathological problems, but it is mirrored (and copies=2).
Presumably at least one of the drives in the mirror or RAID set would have the correct data or non-corrupted data structures.
There was a thread a while back on the risks involved in a SAN LUN (served from something like an EMC array), and whether you could trust the array or whether you should mirror LUNs. (I think the consensus was it was best to mirror LUNs--even from SANs, which presumably are more reliable than consumer SATA drives).
I was also wondering if you could explain why the ZIL can't repair such damage.
Beyond my knowledge.
Finally, a number of posters blamed VB for ignoring a flush, but according to the evil tuning guide, without any application syncs, ZFS may wait up to 5 seconds before issuing a synch, and there
Yes, it will sync every 5 to 30 seconds, but how do you know the data is actually synced?! If the five second timer triggers and ZFS says "okay, time to sync", and goes through the proper procedures, what happens if the drive lies about the sync operation? What then?
That's the whole point of this thread: what should happen, or what should the file system do, when the drive (real or virtual) lies about the syncing? It's just as much a problem with any other POSIX file system (which have to deal with fsync(2))--ZFS isn't that special in that regard. The Linux folks went through a protracted debate on a similar issue not too long ago:
http://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/ http://lwn.net/Articles/322823/
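To make the fsync(2) point concrete, here is a minimal Python sketch of the most an application can do from its side. The name durable_write is made up for illustration; os.write and os.fsync are the standard library wrappers around write(2) and fsync(2). Note what it cannot do: if os.fsync returns without error, all we know is that the kernel asked for a flush and the device reported success. A drive (real or virtual) that lies about the flush is invisible at this level.

```python
import os


def durable_write(path, data):
    """Write data to path and ask the OS to push it to stable storage.

    Returns the number of bytes written. A clean return only means the
    device *reported* the flush as complete; it cannot prove it happened.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        # os.write may write fewer bytes than requested, so loop.
        while view:
            written = os.write(fd, view)
            view = view[written:]
        # Request that file data and metadata be flushed to the device.
        os.fsync(fd)
    finally:
        os.close(fd)
    return len(data)
```

Something like durable_write("/tmp/example.dat", b"payload") is as far as a POSIX application can go; everything below that call is a matter of trusting the storage stack.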
tripping over a patch cord or a router blowing a fuse. Doesn't this mean /any/ hardware might have this problem, albeit with much lower probability?
Yes, which is why it's always recommended to have redundancy in your configuration (mirroring or RAID-Z). This way, hopefully, at least one drive is in a consistent state.
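As a rough illustration of why redundancy helps, here is a simplified Python sketch of a checksummed mirror read. This is not ZFS code: the function names are invented, and SHA-256 is used only because it ships with Python. The idea it shows is that when the expected checksum is stored separately from the data, a reader can tell which copy is good and rewrite the bad ones from it.

```python
import hashlib


def checksum(block):
    """Hex digest used to verify a block (SHA-256 chosen for convenience)."""
    return hashlib.sha256(block).hexdigest()


def read_mirrored(copies, expected):
    """Return (good_block, repaired_copies) for a list of mirror copies.

    copies   -- the same logical block as read from each side of the mirror
    expected -- the checksum recorded elsewhere when the block was written
    """
    # Take the first copy whose contents match the recorded checksum.
    good = next((c for c in copies if checksum(c) == expected), None)
    if good is None:
        # No side of the mirror is intact; redundancy cannot help here.
        raise IOError("all copies failed checksum verification")
    # "Repair" by replacing every copy with the verified one.
    return good, [good for _ in copies]
```

With a single copy there is nothing to fall back on once the checksum fails, which is the practical argument for a mirror or RAID-Z.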
This is also (theoretically) why a drive purchased from Sun is more expensive than a drive purchased from your neighbourhood computer shop: Sun (and presumably other manufacturers) takes the time and effort to test things to make sure that when a drive says "I've synced the data", it actually has synced the data. This testing is what you're presumably paying for.
_______________________________________________ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss