On Tue, Dec 29 at 9:16, Brad wrote:
@eric
"As a general rule of thumb, each vdev has the random performance
roughly the same as a single member of that vdev. Having six RAIDZ
vdevs in a pool should give roughly the performance as a stripe of six
bare drives, for random IO."
It sounds like we'll need 16 vdevs striped in a pool to get at least
the performance of 15 drives, plus another 16 drives mirrored for
redundancy.
If you were striping across 16 devices before, you will achieve
similar random IO performance by striping across 16 vdevs, regardless
of their type. Sequential throughput is more a function of the number
of devices than of the number of vdevs: a 3-disk RAIDZ will have
roughly the sequential write throughput of a pair of drives, since
one disk's worth of bandwidth in each stripe goes to parity.
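To make that concrete, here is a rough sketch of building a stripe
of 16 two-way mirrors; the pool name "tank" and the c1tNd0 device
names are placeholders, not anything from this thread:

    # Stripe across 16 mirrored vdevs: random IO scales with the 16
    # vdevs, sequential throughput with the number of data disks.
    i=0; spec=""
    while [ "$i" -lt 16 ]; do
        spec="$spec mirror c1t$((2*i))d0 c1t$((2*i+1))d0"
        i=$((i+1))
    done
    zpool create tank $spec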
You still get checksumming, but if a device fails or corruption
occurs in your non-redundant stripe, ZFS may not have enough
information to repair your data. For a read-only reference copy of
the data, a restore from backup in those situations may be
acceptable, but for most installations it is not.
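For what it's worth, a scrub will make ZFS walk the pool and verify
every checksum, and on a redundant layout it repairs what it finds;
"tank" is again a placeholder pool name:

    zpool scrub tank       # read and verify every allocated block
    zpool status -v tank   # per-device CKSUM counters, plus a list
                           # of damaged files, if any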
The disk cost of a ZFS pool of mirrors is identical to the disk cost
of RAID10: either way, half the raw capacity goes to redundancy.
If we are bounded in IOPS by the number of vdevs, would it make
sense to go with the bare minimum of drives (3) per vdev?
ZFS supports non-redundant vdev layouts, but they're generally not
recommended. The smallest mirror you can build is 2 devices, and the
smallest raidz is 3 devices per vdev.
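For illustration, the two minimums look like this; pool and device
names are placeholders:

    zpool create mtank mirror c2t0d0 c2t1d0          # 2-device mirror
    zpool create ztank raidz c2t2d0 c2t3d0 c2t4d0    # 3-device raidz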
"This winds up looking similar to RAID10 in layout, in that you're
striping across a lot of disks that each consists of a mirror, though
the checksumming rules are different. Performance should also be
similar, though it's possible RAID10 may give slightly better random
read performance at the expense of some data quality guarantees, since
I don't believe RAID10 normally validates checksums on returned data
if the device didn't return an error. In normal practice, RAID10 and
a pool of mirrored vdevs should benchmark against each other within
your margin of error."
That's interesting to know that ZFS's implementation of raid10
doesn't have checksumming built in.
I don't believe I said this. I am reasonably certain that all
zpool/zfs layouts validate checksums, even if built with no
redundancy. The "RAID10-similar" layout in ZFS is a stripe of
mirrors: you build a bunch of 2-device mirrored vdevs and add them
all into a single pool. You wind up with a layout like:
Pool0
  mirror-0
    disk0
    disk1
  mirror-1
    disk2
    disk3
  mirror-2
    disk4
    disk5
  ...
  mirror-N
    disk-2N
    disk-2N+1
This will give you the best random IO performance possible with ZFS,
independent of the type of disks used. (Some of the same rules may
not apply to ramdisks or SSDs, but those are special cases for most
installations.)
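As a sketch, building and later widening that layout would look like
this, with the diagram's placeholder disk names standing in for real
devices:

    # stripe of 2-device mirrored vdevs, matching the layout above
    zpool create Pool0 \
        mirror disk0 disk1 \
        mirror disk2 disk3 \
        mirror disk4 disk5
    # widen the stripe later by adding another mirrored vdev
    zpool add Pool0 mirror disk6 disk7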
--eric
--
Eric D. Mudama
edmud...@mail.bounceswoosh.org