Hi list,
as this matter pops up every now and then in posts on this list, I just want to clarify that the poor real-world performance of RAID-Z (in its current implementation) is NOT something that follows from raidz-style space-efficient redundancy as such, nor from the copy-on-write design used in ZFS.

In an M-way mirrored setup of N disks you get the write performance of the worst disk and a read performance that is the sum of all disks (for both streaming and random workloads; latency is not improved). Apart from the limited write performance, you also get very poor disk utilization from that scenario.
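To make the arithmetic concrete, here is a tiny back-of-the-envelope model (my own sketch in Python, not anything taken from the ZFS code; the disk speeds are made up):

def mirror_throughput(disk_mb_s):
    # disk_mb_s: streaming throughput in MB/s of each of the N mirror members
    write = min(disk_mb_s)   # every write goes to every member, so the worst disk limits it
    read = sum(disk_mb_s)    # reads can be spread across all members
    return write, read

# Example: a 3-way mirror where one disk is slower than the others.
print(mirror_throughput([100, 100, 80]))   # (80, 280): ~80 MB/s writes, ~280 MB/s aggregate reads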

In RAID-Z we currently have to distinguish random reads from streaming reads:
- Write performance (with COW) is (N-M) * the worst single-disk write performance, since all writes are streaming writes by design of ZFS (i.e. N-M times the write throughput of a mirrored setup).
- Streaming read performance is N * the worst read performance of a single disk (which is identical to mirrored if all disks have the same speed).
- The problem with the current implementation is that all N-M data disks in a vdev take part in reading even a single byte from it, which in turn limits random reads to the performance of the slowest of the N-M disks in question.
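Here is the same kind of rough sketch for those figures (again my own illustration with invented numbers, not measurements of real hardware):

def raidz_throughput(disk_mb_s, parity, disk_iops=100):
    # disk_mb_s: per-disk streaming throughput; parity: M; disk_iops: made-up per-disk random-read rate
    n = len(disk_mb_s)
    worst = min(disk_mb_s)
    write_stream = (n - parity) * worst    # COW turns all writes into full-stripe streams
    read_stream = n * worst                # streaming reads keep every spindle busy
    # Current behaviour: every small read touches all n - parity data disks,
    # so the whole vdev delivers roughly the random-read rate of a single disk.
    random_iops_current = disk_iops
    # If a small read only went to the one disk that holds the data, the data
    # disks could serve independent requests in parallel.
    random_iops_possible = (n - parity) * disk_iops
    return write_stream, read_stream, random_iops_current, random_iops_possible

# Example: a 6-disk raidz1 vdev of equal 100 MB/s disks.
print(raidz_throughput([100] * 6, parity=1))   # (500, 600, 100, 500)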

Now let's see if this really has to be this way (which implies no, doesn't it ;-). When reading small blocks of data (as opposed to the streams discussed earlier), the requested data resides on a single disk, and thus reading it does not require sending read commands to all disks in the vdev. Without detailed knowledge of the ZFS code, I suspect the problem is that the logical block size of any ZFS operation always spans the full stripe. If true, I think this is a design error. Without it, random reads from a RAID-Z would be almost as fast as from mirrored data. The theoretical disadvantages come from disks that have different speeds (probably insignificant in any real-life scenario) and from the statistical probability that, by chance, a few particular random reads do in fact have to access the same disk drive to be fulfilled. (In a mirrored setup, ZFS can choose from all idle devices, whereas in RAID-Z it has to wait for the disk that holds the data to finish processing its current requests.) Looking more closely, this effect mostly affects latency (not throughput), as incoming random read requests should be distributed across all devices ever more evenly as the queue of requests gets longer (this would, however, require ZFS to reorder requests for maximum performance).
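To illustrate the queueing argument, here is a crude simulation (my own toy model, not the actual ZFS I/O scheduler; "seek units" are just an abstract cost):

import random

def slowest_disk_queue(n_disks, n_requests, full_stripe, seed=0):
    rng = random.Random(seed)
    queue = [0] * n_disks                       # outstanding seeks per disk
    for _ in range(n_requests):
        if full_stripe:                         # current raidz: every disk seeks for every read
            for d in range(n_disks):
                queue[d] += 1
        else:                                   # only the disk holding the block seeks
            queue[rng.randrange(n_disks)] += 1
    return max(queue)                           # total time is set by the busiest disk

# 5 data disks, 1000 random reads: full-stripe dispatch keeps every disk busy for
# all 1000 seeks, while per-disk dispatch ends up close to the ideal 200 per disk,
# and the relative imbalance shrinks as the queue of requests grows.
print(slowest_disk_queue(5, 1000, True), slowest_disk_queue(5, 1000, False))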

Since this seems to be a real issue for many ZFS users, it would be nice if someone who has more time than me to look into the code could comment on the amount of work required to boost RAID-Z read performance.

Doing so would shift the trade-off between read/write performance and disk utilization significantly. Obviously, if disk space (and the resulting electricity costs) does not matter compared to getting maximum read performance, you will always be best off with 3-way (or even wider) mirrors and a very large number of vdevs in your pool.
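As a quick illustration of that trade-off, assume 12 equal disks (numbers invented, same toy model as above):

disks, size_tb, disk_iops = 12, 1.0, 100

# 4 vdevs of 3-way mirrors
mirror_capacity = 4 * size_tb                  # 4 TB usable out of 12
mirror_random_iops = disks * disk_iops         # every disk can serve a different read

# 2 vdevs of 6-disk raidz1
raidz_capacity = 2 * 5 * size_tb               # 10 TB usable out of 12
raidz_iops_current = 2 * disk_iops             # ~1 disk of random reads per vdev today
raidz_iops_possible = 2 * 5 * disk_iops        # if small reads only touched one disk

print(mirror_capacity, mirror_random_iops)                     # 4.0 1200
print(raidz_capacity, raidz_iops_current, raidz_iops_possible) # 10.0 200 1000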

A further question that springs to mind is whether copies=N is also used to improve read performance. If so, you could have some read-optimized filesystems in a pool while others use maximum storage efficiency (e.g. for backups).

Regards,
        ralf
--
Ralf Bertling 