Hi Karsten, Adam, list,

Adam Leventhal wrote:
> Very interesting data. Your test is inherently single-threaded so I'm not
> surprised that the benefits aren't more impressive -- the flash modules on the
> F20 card are optimized more for concurrent IOPS than single-threaded latency.

Well, I actually wanted to do a bit more bottleneck searching first, but let me weigh in with some measurements of our own :)

We're on a single X4540 with quad-core CPUs, so we're on the older HyperTransport bus. We connected it to two X2200s running CentOS 5, each on its own 1Gb link. We switched cache flushing off for the F20 vmods with the following addition to /kernel/drv/sd.conf (Karsten: if you didn't do this already, you _really_ want to :) ):

  # http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Cache_Flushes
  # Add whitespace to make the vendor ID (VID) 8 and the Product ID (PID) 16 characters long...
  sd-config-list = "ATA MARVELL SD88SA02","cache-nonvolatile";
  cache-nonvolatile=1, 0x40000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1;

As a test we've found that untarring an Eclipse source tarball is a good use case, so we use that. It is called from a shell script that creates a directory, pushes into it and does the unpacking, 40 times on each machine (a rough sketch of the loop is appended after my signature).

Now for the interesting bit:

When we use one vmod, both machines are finished in about 6min45, and zilstat maxes out at about 4200 IOPS. Using four vmods it takes about 6min55, and zilstat maxes out at 2200 IOPS. In both cases, probing the HyperTransport bus seems to show no bottleneck there (although I'd like to see the bidirectional flow, but I know we can't :) ). The network stays comfortably under 400Mbit/s, and that's the peak load when using one vmod.

Looking at the I/O interconnect architecture, it figures that in this setup we traverse the different HT busses quite a lot. So we also placed an Intel dual-port 1Gb NIC in another PCIe slot, so that ZIL traffic should only have to use one HT bus (not counting offloading intelligence). That helped a bit, but not much: around 6min35 using one vmod and 6min45 using four vmods. It did make the HT DTrace output more telling, though, since the outgoing HT bus to the F20 (and the e1000s) is now, as expected, a better indication of the ZIL traffic.

We didn't do the 40 x 2 untar test without an SSD log device; as an indication, unpacking a single tarball then takes about 1min30.

In case it means anything, the single-tarball unpack times measure around (decimals only used as an indication!):

  no_zil           ~  4 s
  1 vmod           ~ 12 s
  1 vmod + Intel   ~ 11.2 s
  4 vmods          ~ 12.5 s
  4 vmods + Intel  ~ 11.6 s

Taking this all into account, I still don't see what's holding it up. Interestingly enough, the client-side times are all within about 10 seconds of each other, but zilstat shows something different. Hypothesis: zilstat shows only one vmod and we're capped in a layer above the ZIL? We can't rule out networking just yet, but my gut tells me we're not network bound here. That leaves the ZFS ZPL/VFS layer?

I'm very open to suggestions on how to proceed... :)

With kind regards,

Jeroen

-- 
Jeroen Roodhart
ICT Consultant
University of Amsterdam
j.r.roodhart uva.nl
Informatiseringscentrum / Technical support/ATG
--
See http://www.science.uva.nl/~jeroen for openPGP public key
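P.S. For anyone who wants to reproduce the load: the untar loop on each client looks roughly like the sketch below. The paths, directory names and tarball filename are placeholders, not our exact script; the idea is simply 40 sequential mkdir/untar rounds per client, timed as a whole.

  #!/bin/bash
  # Sketch of the per-client untar test (CentOS 5, working on the NFS mount).
  # TARBALL and the mount point are placeholders.
  TARBALL=/mnt/nfs/eclipse-src.tar.gz

  cd /mnt/nfs || exit 1
  time for i in $(seq 1 40); do
      mkdir "run-$i"
      pushd "run-$i" >/dev/null
      # File and directory creates over NFS are committed synchronously,
      # so on the server side this is almost entirely ZIL traffic.
      tar xzf "$TARBALL"
      popd >/dev/null
  done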
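P.P.S. The zilstat figures come from Richard Elling's zilstat.ksh running on the X4540 while the clients run their loops; the invocation is essentially the following (one-second interval; exact options quoted from memory, so treat this as approximate):

  # On the X4540, during the test run; prints ZIL activity per second.
  ./zilstat.ksh 1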