Eric D. Mudama wrote:
On Tue, Dec 29 at  9:16, Brad wrote:
The disk cost of a raidz pool of mirrors is identical to the disk cost
of raid10.

ZFS can't do a raidz of mirrors or a mirror of raidzs. Members of a mirror or raidz[123] vdev must be fundamental devices (i.e., files or whole drives).



"This winds up looking similar to RAID10 in layout, in that you're
striping across a lot of disks that each consists of a mirror, though
the checksumming rules are different. Performance should also be
similar, though it's possible RAID10 may give slightly better random
read performance at the expense of some data quality guarantees, since
I don't believe RAID10 normally validates checksums on returned data
if the device didn't return an error. In normal practice, RAID10 and
a pool of mirrored vdevs should benchmark against each other within
your margin of error."

It's interesting to know that ZFS's implementation of raid10
doesn't have checksumming built in.

I don't believe I said this.  I am reasonably certain that all
zpool/zfs layouts validate checksums, even if built with no
redundancy.  The "RAID10-similar" layout in ZFS is an array of
mirrors: you build a bunch of 2-device mirrored vdevs and add them
all into a single pool.  ZFS then stripes writes across all of the mirrors.
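As a rough illustration of that striped-mirrors layout, here is a toy model in Python. All names here are mine, not ZFS code, and real ZFS allocation is dynamic rather than simple round-robin:

```python
# Toy model of a "pool of mirrors" (RAID10-like): each vdev is a
# 2-disk mirror, and the pool stripes blocks across the vdevs.

class Mirror:
    def __init__(self):
        self.disks = [{}, {}]          # two member disks: block_id -> data

    def write(self, block_id, data):
        for disk in self.disks:        # every write goes to both sides
            disk[block_id] = data

    def read(self, block_id):
        return self.disks[0][block_id]

class Pool:
    def __init__(self, nmirrors):
        self.vdevs = [Mirror() for _ in range(nmirrors)]

    def write(self, block_id, data):
        # Round-robin striping; ZFS actually load-balances dynamically.
        self.vdevs[block_id % len(self.vdevs)].write(block_id, data)

    def read(self, block_id):
        return self.vdevs[block_id % len(self.vdevs)].read(block_id)
```

The disk cost works out the same as RAID10: half the raw capacity, since every block lands on both sides of one mirror.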


Yes. PLEASE be careful - checksumming and redundancy are DIFFERENT concepts.

In ZFS, EVERYTHING is checksummed - the data blocks, and the metadata. This is separate from redundancy. Regardless of the zpool layout (mirrors, raidz, or no redundancy), ZFS stores a checksum of every object; this checksum is used to determine whether the object has been corrupted. The check is performed on every /read/.

Should the checksum show that an object is corrupt, one of two things happens. If your zpool has some form of redundancy for that object, ZFS rereads the object from the redundant side of the mirror, or reconstructs the data using parity; it then rewrites the object to another place in the zpool and eliminates the "bad" object. If there is no redundancy, ZFS fails to return the data and logs an error message to the syslog.
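That self-healing read path can be sketched in a few lines of Python. The function name and structure are my own illustration (real ZFS stores checksums in the parent block and uses fletcher or sha256; I use sha256 here, and "heal" in place where ZFS would rewrite the block elsewhere):

```python
import hashlib

def read_block(copies):
    """Sketch of ZFS's self-healing read path.

    `copies` is a list of (data, stored_checksum) pairs, one per
    redundant copy of the block (e.g. the two sides of a mirror).
    """
    for data, stored in copies:
        if hashlib.sha256(data).digest() == stored:
            # Good copy found: heal any sibling that fails its checksum.
            # (ZFS rewrites the data elsewhere and drops the bad block;
            # overwriting in place is a simplification.)
            for i, (other, other_sum) in enumerate(copies):
                if hashlib.sha256(other).digest() != other_sum:
                    copies[i] = (data, stored)
            return data
    # No copy passed its checksum: fail the read and log an error.
    raise IOError("checksum mismatch on every copy of block")
```

With a two-sided mirror where one side has bit-rotted, the read succeeds and repairs the bad side; with no redundancy, the same corruption turns into a hard read error.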

In the case of metadata, even in a non-redundant zpool, some metadata is stored multiple times, so there is a chance you will be able to recover/reconstruct metadata which fails its checksum.

In short, checksumming is how ZFS /detects/ data corruption, and redundancy is how ZFS /fixes/ it. Checksumming is /always/ present, while redundancy depends on the pool layout and options (cf. the "copies" property).
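The metadata ditto blocks and the "copies" property are the same basic trick: store a block more than once, each copy independently checksummed, so a single bad copy is survivable even with no pool-level redundancy. A rough sketch (function names and the dict-as-disk are mine, not ZFS internals):

```python
import hashlib

def write_with_copies(disk, block_id, data, copies=2):
    """Sketch of the "copies" property / metadata ditto blocks:
    store `copies` checksummed duplicates of one logical block at
    different locations on a single, non-redundant device."""
    csum = hashlib.sha256(data).digest()
    for i in range(copies):
        disk[(block_id, i)] = (data, csum)

def read_with_copies(disk, block_id, copies=2):
    """Return the first stored copy that passes its checksum."""
    for i in range(copies):
        data, csum = disk[(block_id, i)]
        if hashlib.sha256(data).digest() == csum:
            return data
    raise IOError("all copies of block failed checksum")
```

This is why a corrupt directory block on a single-disk pool can sometimes still be read back, while a plain data block (at the default copies=1) cannot.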



--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
