On Tue, 14 Jul 2009, Ross wrote:

> My guess is something like it's single threaded, with each file dealt with in order and requests being serviced by just one or two disks at a time. With that being the case, an x4500 is essentially just running off 7200 rpm SATA drives, which really is nothing special.

Keep in mind that there is supposed to be file-level read-ahead. As an example, ZFS is able to read from my array at up to 551 MB/second when reading from a huge (64GB) file, yet it is only managing 145 MB/second or so for these 8MB files sequentially accessed by cpio. This suggests that even for the initial read case ZFS is not applying enough file-level read-ahead (or not applying it soon enough) to keep the disks busy. 8MB is still pretty big in the world of files. Perhaps it takes ZFS a long time to decide that read-ahead is required.
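
To put rough numbers on that: 8MB / 551 MB/second is about 14.5ms per file, while 8MB / 145 MB/second is about 55ms, so something like 40ms is being lost on each file. That is several seek-plus-rotation times on a 7200 rpm SATA drive, which fits with read-ahead only ramping up part way through each file (if it kicks in at all).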

I have yet to find a tunable for file-level read-ahead. There are tunables for vdev-level read-ahead, but vdev read-ahead is fairly minor to begin with and increasing it may cause more harm than good.

> A quick summary of some of the figures, with times normalized for 3000 files:
>
> Sun x2200, single 500GB SATA:   6m25.15s
> Sun v490, raidz1 zpool of 6x146GB SAS drives on a J4200:   2m46.29s
> Sun X4500, 7 sets of mirrored 500GB SATA:   3m0.83s
> Sun x4540 (unknown pool - Jorgen, what are you running?):   4m7.13s

And mine:

Ultra 40-M2 / StorageTek 2540, 6 sets of mirrored 300GB SAS: 2m44.20s
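
As a rough cross-check: 3000 files x 8MB is 24000MB, and 24000MB / 145 MB/second works out to about 165 seconds, or roughly 2m45s. That is almost exactly what the fastest configurations above achieve, which suggests they are all hitting the same single-stream ceiling rather than running out of disk bandwidth.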

I think that Jorgen implied that his system is using SAN storage with a mirror across two jumbo LUNs.

> The raid pool of SAS drives is quicker again, but for a single threaded request that also seems about right. The random read benefits of the mirror aren't going to take effect unless you run multiple reads in parallel. What I suspect is helping here are the slightly better seek times of the SAS drives, along with slightly higher throughput due to the raid.

Once ZFS decides to apply file-level read-ahead, it can issue many reads in parallel. It should be able to keep at least six disks busy at once, leading to much better performance than we are seeing.
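
If that is right, one user-space way to see it (just a sketch, not something I have run against this data set; the Python, the file glob argument, and the thread counts are all placeholders of my own) would be to read the same set of files with a varying number of threads and watch whether the aggregate rate climbs from the ~145 MB/second single-stream figure toward the ~551 MB/second streaming figure:

#!/usr/bin/env python3
# Sketch: read a set of files with N worker threads and report aggregate
# throughput.  If more threads push the rate toward the streaming figure
# while one thread stays near 145 MB/second, the disks themselves are not
# the bottleneck for the single-stream case.  The path and thread counts
# below are placeholders, not from the original test.
import sys, time, glob
from concurrent.futures import ThreadPoolExecutor

def read_file(path):
    total = 0
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(1 << 20)   # 1MB reads
            if not chunk:
                break
            total += len(chunk)
    return total

def run(paths, workers):
    start = time.time()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        nbytes = sum(pool.map(read_file, paths))
    secs = time.time() - start
    print("%2d workers: %6.1f MB/second" % (workers, nbytes / secs / 1e6))

if __name__ == '__main__':
    # e.g. a quoted glob such as '/tank/files/*' (hypothetical path)
    files = sorted(glob.glob(sys.argv[1]))
    for workers in (1, 2, 4, 8):
        run(files, workers)

One caveat: after the first pass the files will be sitting in the ARC, so the files would need to be recreated (or the cache otherwise defeated) before each timing run for the comparison to mean anything.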

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
