On Tue, 14 Jul 2009, Ross wrote:
My guess is that it's single-threaded, with each file dealt with in
order and requests being serviced by just one or two disks at a time.
If that's the case, an x4500 is essentially just running off 7200 rpm
SATA drives, which really is nothing special.
Keep in mind that there is supposed to be file-level read-ahead. As an
example, ZFS is able to read from my array at up to 551 MB/second when
reading from a huge (64GB) file, yet it only manages 145 MB/second or
so for these 8MB files accessed sequentially by cpio. This suggests
that even in the initial read case ZFS is not applying enough
file-level read-ahead (or not applying it soon enough) to keep the
disks busy. 8MB is still pretty big in the world of files. Perhaps it
takes ZFS a long time to decide that read-ahead is required.
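As a rough back-of-the-envelope (plain Python, using only the numbers
above): the gap works out to roughly 40 ms of extra wall-clock time per
8MB file compared with what the 551 MB/second streaming rate would
predict. That is a handful of serialized I/O waits per file, which is
about what you would expect if read-ahead only gets going partway
through each file, or never gets deep enough.

# Rough per-file arithmetic for the read-ahead gap (numbers from this thread).
MB = 1024 * 1024            # treating MB as MiB; close enough for an estimate

file_size = 8 * MB          # "these 8MB files"
fast_rate = 551 * MB        # streaming rate from one huge (64GB) file
slow_rate = 145 * MB        # observed rate for the 8MB-file cpio run

t_fast = file_size / fast_rate   # ~14.5 ms per file if read-ahead kept up
t_slow = file_size / slow_rate   # ~55 ms per file actually observed

print("per-file time at 551 MB/s: %.1f ms" % (t_fast * 1e3))
print("per-file time at 145 MB/s: %.1f ms" % (t_slow * 1e3))
print("lost per file:             %.1f ms" % ((t_slow - t_fast) * 1e3))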
I have yet to find a tunable for file-level read-ahead. There are
tunables for vdev-level read-ahead, but vdev read-ahead is fairly
minor and increasing it may cause more harm than good.
A quick summary of some of the figures, with times normalized for 3000 files:
Sun x2200, single 500GB SATA: 6m25.15s
Sun v490, raidz1 zpool of 6x146GB SAS drives on a J4200: 2m46.29s
Sun X4500, 7 sets of mirrored 500GB SATA: 3m0.83s
Sun x4540, (unknown pool - Jorgen, what are you running?): 4m7.13s
And mine:
Ultra 40-M2 / StorageTek 2540, 6 sets of mirrored 300GB SAS: 2m44.20s
I think that Jorgen implied that his system is using SAN storage with
a mirror across two jumbo LUNs.
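Out of curiosity, converting those normalized times into effective
throughput (quick Python below, assuming each run really did read
3000 x 8MB = 24000 MB) puts my 2540 run at roughly the same
145 MB/second mentioned earlier:

# Convert the normalized cpio times above into effective throughput,
# assuming each run read 3000 files of 8 MB each (24000 MB total).
total_mb = 3000 * 8

runs = {
    "x2200, single 500GB SATA":            6 * 60 + 25.15,
    "v490, raidz1 6x146GB SAS (J4200)":    2 * 60 + 46.29,
    "X4500, 7x mirrored 500GB SATA":       3 * 60 + 0.83,
    "x4540 (pool unknown)":                4 * 60 + 7.13,
    "Ultra 40-M2 / 2540, 6x mirrored SAS": 2 * 60 + 44.20,
}

for name, seconds in runs.items():
    print("%-38s %6.1f MB/s" % (name, total_mb / seconds))

So even the best configurations here top out around 145 MB/second,
well short of the 551 MB/second that the same 2540 array delivers
when streaming a single large file.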
The RAID pool of SAS drives is quicker again, but for a single-threaded
request that also seems about right. The random-read benefits of the
mirror aren't going to take effect unless you run multiple reads in
parallel. What I suspect is helping here are the slightly better seek
times of the SAS drives, along with slightly higher throughput due to
the RAID.
Once ZFS decides to apply file-level read-ahead, it can issue many
reads in parallel. It should be able to keep at least six disks busy
at once, leading to much better performance than we are seeing.
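Just to illustrate the kind of access pattern I mean (a user-level
sketch in Python, not how ZFS's prefetch code actually works, and the
1MB chunk size, 8-deep queue, and file path are made up): keeping
several reads outstanding at different offsets is what lets a stripe
of mirrors keep more than one spindle busy, whereas cpio's plain
read-and-process loop relies entirely on the filesystem's read-ahead
to create that parallelism.

# Sketch only: read a file with several outstanding requests, the way
# file-level read-ahead could, instead of one strictly serial read at a time.
import os
from concurrent.futures import ThreadPoolExecutor

CHUNK = 1024 * 1024   # 1MB per request (illustrative)
DEPTH = 8             # outstanding requests; enough to touch several disks

def read_file_parallel(path):
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offsets = range(0, size, CHUNK)
        with ThreadPoolExecutor(max_workers=DEPTH) as pool:
            # Each pread targets a different offset, so a striped or
            # mirrored pool can service several of them from different
            # disks at the same time.
            chunks = pool.map(lambda off: os.pread(fd, CHUNK, off), offsets)
            return b"".join(chunks)
    finally:
        os.close(fd)

if __name__ == "__main__":
    data = read_file_parallel("/tank/somefile")   # hypothetical path
    print(len(data), "bytes read")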
Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/