Hi Karsten, Adam, list,

Adam Leventhal wrote:

>Very interesting data. Your test is inherently single-threaded so I'm not 
>surprised that the benefits aren't more impressive -- the flash modules on the 
>F20 card are optimized more for concurrent IOPS than single-threaded latency.

Well, I actually wanted to do a bit more bottleneck searching, but let me weigh 
in with some measurements of our own :)

We're on a single X4540 with quad-core CPUs, so we're on the older 
HyperTransport bus. We connected it up to two X2200s running CentOS 5, each on 
its own 1Gb link. We disabled cache flushes for the F20 vmods with the following 
addition to the /kernel/drv/sd.conf file (Karsten: if you didn't do this 
already, you _really_ want to :) ):

# http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Cache_Flushes
# Add whitespace to make the vendor ID (VID) 8 ... and Product ID (PID) 16
# characters long...
sd-config-list = "ATA     MARVELL SD88SA02","cache-nonvolatile";
cache-nonvolatile=1, 0x40000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1;

As a test we've found that untarring an Eclipse source tarball is a good use 
case, so that's what we use: a shell script creates a directory, pushes into 
it and does the unpacking, 40 times over on each machine.
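
For the record, the loop is essentially the sketch below (the mount point and 
tarball name are placeholders, not our actual paths):

    #!/bin/bash
    # Unpack the Eclipse source tarball 40 times, each run in its own directory.
    # TESTDIR and TARBALL are placeholders; adjust to your setup.
    TESTDIR=/mnt/x4540/scratch         # share exported by the X4540 (placeholder)
    TARBALL=/var/tmp/eclipse-src.tar   # Eclipse source tar (placeholder; add z/j if compressed)
    for i in $(seq 1 40); do
        mkdir "$TESTDIR/run$i"
        pushd "$TESTDIR/run$i" > /dev/null
        tar xf "$TARBALL"
        popd > /dev/null
    done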

Now for the interesting bit: 

When we use one vmod, both machines finish in about 6min45 and zilstat maxes 
out at about 4200 IOPS.
Using four vmods it takes about 6min55 and zilstat maxes out at about 2200 IOPS.
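
For the record, the zilstat figures come from Richard Elling's zilstat script, 
run on the X4540 while the clients are untarring. Roughly like this (a sketch; 
exact option syntax depends on the zilstat version you have):

    # 1-second samples on the X4540 during the test run
    ./zilstat 1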

In both cases, probing the HyperTransport bus seems to show no bottleneck 
there (although I'd like to see the bidirectional flow, but I know we can't :) 
). The network stays comfortably under 400Mbit/s, and that's the peak load when 
using 1 vmod.

Looking at the I/O-connection architecture, it figures that in this setup we 
traverse the different HT buses quite a lot. So we've also placed an Intel 
dual 1Gb NIC in another PCIe slot, so that ZIL traffic should only have to 
cross one HT bus (not counting offloading intelligence). That helped a bit, 
but not much:

Around 6min35 using one vmod and 6min45 using four vmods.

It did make the HT DTrace output more telling, though, since the outgoing HT 
bus to the F20 (and the e1000s) now is, as expected, a better indication of 
the ZIL traffic.

We didn't do the 40 x 2 untar test without an SSD log device. As an 
indication: unpacking a single tarball then takes about 1min30.

In case it means anything, a single tarball unpack with no_zil, 1vmod, 
1vmod_Intel, 4vmods and 4vmod_Intel measures around (decimals only used as an 
indication!):

      no_zil    1vmod    1vmod_Intel    4vmods    4vmod_Intel
          4s      12s          11.2s     12.5s          11.6s
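
Each of those is just the wall-clock time of a single extraction into an empty 
directory, measured along the lines of the one-liner below (placeholder paths 
again):

    mkdir single && cd single && time tar xf /var/tmp/eclipse-src.tar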


Taking this all into account, I still don't see what's holding it up. 
Interestingly enough, the client-side times are all within about 10 seconds of 
each other, but zilstat shows something different. Hypothesis: zilstat only 
shows one vmod and we're capped in a layer above the ZIL? Can't rule out 
networking just yet, but my gut tells me we're not network bound here. That 
leaves the ZFS ZPL/VFS layer?

I'm very open to suggestions on how to proceed... :)

With kind regards,

Jeroen
--
Jeroen Roodhart
ICT Consultant
                                        University of Amsterdam
j.r.roodhart uva.nl          Informatiseringscentrum
                                        Technical support/ATG
--
See http://www.science.uva.nl/~jeroen for openPGP public key