[EMAIL PROTECTED] wrote on 01/26/2007 01:43:35 PM:
> On Fri, Jan 26, 2007 at 11:05:17AM -0800, Ed Gould wrote:
> > On Jan 26, 2007, at 9:42, Gary Mills wrote:
> > >How does this work in an environment with storage that's centrally-
> > >managed and shared between many servers?
> >
> > It will work, but if the storage system corrupts the data, ZFS will be
> > unable to correct it. It will detect the error.
> >
> > A number that I've been quoting, albeit without a good reference, comes
> > from Jim Gray, who has been around the data-management industry for
> > longer than I have (and I've been in this business since 1970); he's
> > currently at Microsoft. Jim says that the controller/drive subsystem
> > writes data to the wrong sector of the drive without notice about once
> > per drive per year. In a 400-drive array, that's once a day. ZFS will
> > detect this error when the file is read (one of the blocks' checksum
> > will not match). But it can only correct the error if it manages the
> > redundancy.
>
> Our Netapp does double-parity RAID. In fact, the filesystem design is
> remarkably similar to that of ZFS. Wouldn't that also detect the
> error? I suppose it depends if the `wrong sector without notice'
> error is repeated each time. Or is it random?
I don't know.  WAFL and the rest of the NetApp back end are never really
described in much technical detail; even getting real IOPS numbers out of
them seems to be a hassle.  Lots of magic, little meat.  ZFS, to me, has
well-defined behavior and methodology (you can even read the source to
verify specifics), and that lets you _know_ where the weak points are.

NetApp, EMC and other storage vendors may have a financial incentive to
tolerate edge cases such as the write hole or bit rot: some number of
errors per disk is written off as an acceptable loss, and only beyond that
threshold does the cost/benefit analysis favor replacing the disk.  Will
customers actually know a bit has flipped?  In EMC's case it is very common
for a disk to accumulate multiple read/write errors before EMC will swap it
out; they even reserve a substantial portion of each disk for remapped
sectors and parity (outside of the RAID level), which offsets or postpones
disk replacement, with the cost carried by the customer.
The most detailed description of WAFL I was able to find last time I looked
was:
http://www.netapp.com/library/tr/3002.pdf
>
> > I would suggest exporting two LUNs from your central storage and let
> > ZFS mirror them. You can get a wider range of space/performance
> > tradeoffs if you give ZFS a JBOD, but that doesn't sound like an
> > option.
>
> That would double the amount of disk that we'd require. I am actually
> planning on using two iSCSI LUNs and letting ZFS stripe across them.
> When we need to expand the ZFS pool, I'd like to just expand the two
> LUNs on the Netapp. If ZFS won't accommodate that, I can just add a
> couple more LUNs. This is all convenient and easily manageable.
If you do get bit errors coming from the NetApp, ZFS will detect them but
will not be able to correct them in this case, since it has no redundant
copy of its own to repair from.
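
For what it's worth, the difference between detect-only and self-healing
comes down to how the pool is built.  A minimal sketch ("tank", "lun0" and
"lun1" are only placeholder names for the pool and your iSCSI devices):

    # striped pool: checksum errors are detected on read but cannot be repaired
    zpool create tank lun0 lun1

    # mirrored pool: ZFS detects the bad copy and rewrites it from the good one
    zpool create tank mirror lun0 lun1

    # walk every block in the pool and verify it against its checksum
    zpool scrub tank
    zpool status -v tank

With the stripe a failed checksum shows up in the CKSUM column of zpool
status and the read returns an error; with the mirror ZFS repairs the bad
block automatically from the other copy.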
>
> --
> -Gary Mills- -Unix Support- -U of M Academic Computing and Networking-
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss