I know there have been a number of discussions of various ZFS performance issues, but I did not see anything specifically on this one. While testing a new configuration of an SE-3511 (SATA) array, I ran into an interesting ZFS performance problem. I do not believe it is causing a major issue for our end users (though it may be), but it is certainly impacting our nightly backups: I am only seeing 10-20 MB/sec per thread of random read throughput, using iozone for testing. Here is the full config:

SF-V480
 --- 4 x 1.2 GHz UltraSPARC III+ CPUs
 --- 16 GB memory
 --- Solaris 10U6 with the ZFS patch and the IDR for the snapshot/resilver bug

SE-3511
 --- 12 x 500 GB SATA drives
 --- 11 disk RAID-5
 --- dual 2 Gbps FC host connections

I have the ARC size limited to 1 GB so that I can test with a rational data set size. The total amount of data I am testing with is 3 GB, with a 256 KB record size, and I tested with 1 through 20 threads.
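In case the exact methodology matters: the ARC cap is the usual zfs_arc_max setting in /etc/system, and the iozone runs were along the lines shown below. Treat these as a sketch from memory with generic pool and file names, not the literal command lines I ran.

    * in /etc/system (plus a reboot) to cap the ARC at 1 GB
    set zfs:zfs_arc_max = 0x40000000

    # single stream: one 3 GB file, 256 KB records, write/read/random tests
    iozone -i 0 -i 1 -i 2 -r 256k -s 3g -f /testpool/iozone.tmp

    # throughput mode, keeping the total data set at 3 GB,
    # e.g. 8 threads x 384 MB each
    iozone -i 0 -i 1 -i 2 -r 256k -s 384m -t 8 \
        -F /testpool/t1 /testpool/t2 /testpool/t3 /testpool/t4 \
           /testpool/t5 /testpool/t6 /testpool/t7 /testpool/t8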
With 1 thread I got the following results:

 sequential write: 112 MB/sec
 sequential read:  221 MB/sec
 random write:      96 MB/sec
 random read:       18 MB/sec

As I scaled up the number of threads (keeping the total data size the same) I got the following (throughput in MB/sec):

 threads    sw     sr     rw     rr
    2      105    218     93     34
    4      106    219     88     52
    8       95    189     69     92
   16       71    153     76    128

As the number of threads climbs, the first three values drop once you get above 4 threads (one per CPU), but the fourth (random read) keeps climbing well past 4 threads. It is just about linear through 9 threads, then it starts fluctuating but continues climbing through at least 20 threads (I did not test past 20). Above 16 threads the random read even exceeds the sequential read values.

Looking at iostat output for the LUN I am using: in the 1 thread case, for the first three tests (sequential write, sequential read, random write) I see %b at 100 and actv climb to 35 and stay there. For the random read test I see %b at 5 to 7, actv under 1 (usually around 0.5 to 0.6), wsvc_t essentially 0, and asvc_t around 14. As the number of threads increases, the iostat values do not really change for the first three tests, but they climb for the random read. The array is close to saturated at about 170 MB/sec of random read (18 threads), so I know the 18 MB/sec value for one thread is _not_ limited by the array.

I know the 3511 is not a high performance array, but we needed lots of bulk storage and could not afford better when we bought these three years ago. Still, it seems to me that there is something wrong with the random read performance of ZFS.

To test whether this is an effect of the 3511, I ran some tests on another system we have:

T2000
 --- 32 thread 1 GHz
 --- 32 GB memory
 --- Solaris 10U8
 --- 4 internal 72 GB SAS drives

We have a zpool built from one slice on each of the 4 internal drives, configured as a striped mirror layout (2 vdevs, each of 2 slices), so I/O is spread over all 4 spindles (the equivalent pool layout is sketched below, after the numbers). I started with 4 threads and 8 GB each (32 GB total, to make sure I got past the ARC; it is not tuned down on this system). I saw exactly the same ratio of sequential read to random read: the random read throughput was 23% of the sequential read throughput on both systems. Based on the iostat values during the test, I am saturating all four drives on the writes with just 1 thread, the sequential read saturates the drives with anything more than 1 thread, and the random read does not saturate the drives until I get to about 6 threads.

 threads    sw     sr     rw     rr
    1      100    207     88     30
    2      103    370     88     53
    4       98    350     90     82
    8      101    434     92     95

This confirms that the problem is not unique to either 10U6 or the IDR (10U8 shows the same behavior), and that it is not unique to an FC attached disk array or to the SE-3511 in particular.
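For reference, the T2000 pool layout is equivalent to what a zpool create along these lines would give (the pool name and slice names here are made up, not the actual ones on that box):

    zpool create testpool mirror c0t0d0s3 c0t1d0s3 \
                          mirror c0t2d0s3 c0t3d0s3

    # zpool status then shows two mirror top-level vdevs of two slices
    # each, and ZFS stripes I/O across the two mirrors.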
Then I went back and took another look at my original data (SF-V480 / SE-3511), this time as throughput per thread (the aggregate numbers above divided by the thread count). For the sequential operations and the random write, the per thread throughput falls far and fast, but the per thread random read numbers fall very slowly.

Per thread throughput in MB/sec:

 threads    sw     sr     rw     rr
    1      112    221     96     18
    2       53    109     46     17
    4       26     55     22     13
    8       12     24      9     12
   16        5     10      5      8

So this makes me think that the random read performance issue is a per thread limitation. Does anyone have any idea why ZFS is not reading as fast as the underlying storage can handle in the case of random reads? Or am I seeing an artifact of iozone itself? Is there another benchmark I should be using?

P.S. I posted an OpenOffice.org spreadsheet of my test results here:
http://www.ilk.org/~ppk/Geek/throughput-summary.ods

--
{--------1---------2---------3---------4---------5---------6---------7---------}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company ( http://www.sloctheater.org/ )
-> Technical Advisor, Lunacon 2010 ( http://www.lunacon.org/ )
-> Technical Advisor, RPI Players