I've read in numerous threads that it's important to use ECC RAM in a ZFS file server.
My question is: is there a technical reason in ZFS's design that makes ECC RAM particularly important? Is ZFS more vulnerable than other filesystems to bit errors in RAM? For example, if the wrong bit flips at the wrong time, could I lose my entire RAID-Z pool rather than, say, corrupting one file's contents or metadata? (Assume the rest of the hardware stack "behaves", e.g., an fsync to the drive won't return until the bytes are written to stable storage.)

I had assumed that a bit error in RAM would have only a localized effect (e.g., corrupting the contents or metadata of one file or directory) each time it "struck", but now I'm wondering whether the failure could be global because of something in ZFS's design, and whether that's why the recommendation for ECC RAM is always so "strong".

Some of the posts in this thread ("Another user loses his pool..."): http://opensolaris.org/jive/thread.jspa?threadID=108213&tstart=0 make me think ZFS may in fact "require" ECC RAM.