On Jul 4, 2009, at 12:03 AM, Bob Friesenhahn wrote:
% ./diskqual.sh
c1t0d0 130 MB/sec
c1t1d0 130 MB/sec
c2t202400A0B83A8A0Bd31 13422 MB/sec
c3t202500A0B83A8A0Bd31 13422 MB/sec
c4t600A0B80003A8A0B0000096A47B4559Ed0 191 MB/sec
c4t600A0B80003A8A0B0000096E47B456DAd0 192 MB/sec
c4t600A0B80003A8A0B0000096147B451BEd0 192 MB/sec
c4t600A0B80003A8A0B0000096647B453CEd0 192 MB/sec
c4t600A0B80003A8A0B0000097347B457D4d0 212 MB/sec
c4t600A0B800039C9B500000A9C47B4522Dd0 191 MB/sec
c4t600A0B800039C9B500000AA047B4529Bd0 192 MB/sec
c4t600A0B800039C9B500000AA447B4544Fd0 192 MB/sec
c4t600A0B800039C9B500000AA847B45605d0 191 MB/sec
c4t600A0B800039C9B500000AAC47B45739d0 191 MB/sec
c4t600A0B800039C9B500000AB047B457ADd0 191 MB/sec
c4t600A0B800039C9B500000AB447B4595Fd0 191 MB/sec
somehow i don't think that reading the first 64MB (presumably off a raw
disk device) 3 times and picking the middle value is going to give you
much useful information on the overall state of the disks ..
i believe this was more of a quick hack to just validate that there's
nothing too far out of the norm, but with that said - what are the c2
and c3 devices above? you've got to be caching the heck out of those to
get that unbelievable 13 GB/s - so you're really only seeing memory
speeds there
more useful information would come from something like the old taz or
one of the disk I/O latency tools while you're actually driving a workload.
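for example, a minimal dtrace sketch along these lines (stock io provider
probes; the aggregation key and everything else is just illustrative) gives
you a per-device latency distribution while the load is running:

#!/usr/sbin/dtrace -s
/* rough sketch: time each physical I/O from io:::start to io:::done
   and build a latency distribution per device */

io:::start
{
        start[arg0] = timestamp;
}

io:::done
/start[arg0]/
{
        @lat[args[1]->dev_statname] = quantize(timestamp - start[arg0]);
        start[arg0] = 0;
}

run it for a minute or two under load and compare the distributions for the
array LUNs against the local disks.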
% arc_summary.pl
System Memory:
Physical RAM: 20470 MB
Free Memory : 2371 MB
LotsFree: 312 MB
ZFS Tunables (/etc/system):
* set zfs:zfs_arc_max = 0x300000000
set zfs:zfs_arc_max = 0x280000000
* set zfs:zfs_arc_max = 0x200000000
ARC Size:
Current Size: 9383 MB (arcsize)
Target Size (Adaptive): 10240 MB (c)
Min Size (Hard Limit): 1280 MB (zfs_arc_min)
Max Size (Hard Limit): 10240 MB (zfs_arc_max)
ARC Size Breakdown:
Most Recently Used Cache Size: 6% 644 MB (p)
Most Frequently Used Cache Size: 93% 9595 MB (c-p)
ARC Efficency:
Cache Access Total: 674638362
Cache Hit Ratio: 91% 615586988 [Defined State for buffer]
Cache Miss Ratio: 8% 59051374 [Undefined State for Buffer]
REAL Hit Ratio: 87% 590314508 [MRU/MFU Hits Only]
Data Demand Efficiency: 96%
Data Prefetch Efficiency: 7%
CACHE HITS BY CACHE LIST:
Anon: 2% 13626529 [ New Customer, First Cache Hit ]
Most Recently Used: 78% 480379752 (mru) [ Return Customer ]
Most Frequently Used: 17% 109934756 (mfu) [ Frequent Customer ]
Most Recently Used Ghost: 0% 5180256 (mru_ghost) [ Return Customer Evicted, Now Back ]
Most Frequently Used Ghost: 1% 6465695 (mfu_ghost) [ Frequent Customer Evicted, Now Back ]
CACHE HITS BY DATA TYPE:
Demand Data: 78% 485431759
Prefetch Data: 0% 3045442
Demand Metadata: 16% 103900170
Prefetch Metadata: 3% 23209617
CACHE MISSES BY DATA TYPE:
Demand Data: 30% 18109355
Prefetch Data: 60% 35633374
Demand Metadata: 6% 3806177
Prefetch Metadata: 2% 1502468
---------------------------------------------
Prefetch seems to be performing badly. Ben Rockwood's blog entry at
http://www.cuddletech.com/blog/pivot/entry.php?id=1040 discusses
prefetch. The sample DTrace script on that page only shows cache
misses:
vdev_cache_read: 6507827833451031357 read 131072 bytes at offset 6774849536: MISS
vdev_cache_read: 6507827833451031357 read 131072 bytes at offset 6774980608: MISS
Unfortunately, the file-level prefetch DTrace sample script from the
same page seems to have a syntax error.
if you're using LUNs off an array - this might be another case of
zfs_vdev_max_pending being tuned more for direct-attach drives .. you
could be queueing up too much I/O against the RAID controller,
particularly if the controller is also trying to prefetch out of
its cache.
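if that turns out to be the case, the usual knob is zfs_vdev_max_pending -
a rough example of checking and lowering it on a live system (the value 10
is purely illustrative, pick something the controller can sensibly queue):

# check the current per-vdev queue depth, then drop it on the fly
# (0t10 = decimal 10 - example value only)
echo zfs_vdev_max_pending/D | mdb -k
echo zfs_vdev_max_pending/W0t10 | mdb -kw

the same thing can be made persistent in /etc/system with
set zfs:zfs_vdev_max_pending = 10, same style as the arc_max lines above.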
I tried disabling file level prefetch (zfs_prefetch_disable=1) but
did not observe any change in behavior.
this is only going to help if you've got problems in zfetch .. you'd
probably see this better by looking for high lock contention in zfetch
with lockstat
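something along these lines while the copy is running - the exact lock and
caller names you'd look for vary by build, so treat the zfetch angle as a
guess to confirm or rule out:

# top 20 lock contention events over 30 seconds of the workload
lockstat -C -D 20 sleep 30
# kernel profiling view with callers coalesced, to see where the
# time is actually going
lockstat -kIW -D 20 sleep 30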
# kstat -p zfs:0:vdev_cache_stats
zfs:0:vdev_cache_stats:class misc
zfs:0:vdev_cache_stats:crtime 130.61298275
zfs:0:vdev_cache_stats:delegations 754287
zfs:0:vdev_cache_stats:hits 3973496
zfs:0:vdev_cache_stats:misses 2154959
zfs:0:vdev_cache_stats:snaptime 451955.55419545
Performance when copying 236 GB of files (each file is 5537792 bytes,
with 20001 files per directory) from one directory to another:
Copy Method Data Rate
==================================== ==================
cpio -pdum 75 MB/s
cp -r 32 MB/s
tar -cf - . | (cd dest && tar -xf -) 26 MB/s
I would expect data copy rates approaching 200 MB/s.
you might want to dtrace this to break down where the latency is
occurring .. e.g. is this a DNLC caching problem, an ARC problem, or a
device-level problem
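a rough sketch of that kind of breakdown (the cpio execname is an
assumption - change the predicate to match the copier; also note that
prefetch/zio I/O won't be issued in the copier's context, so this only
covers the synchronous path):

#!/usr/sbin/dtrace -s
/* counts dnlc lookups that hit vs. miss for the copier, and shows the
   read(2) latency distribution: a mostly-fast distribution points at
   the ARC, a long tail points at the devices underneath */

fbt::dnlc_lookup:return
/execname == "cpio"/
{
        @dnlc[arg1 == 0 ? "dnlc miss" : "dnlc hit"] = count();
}

syscall::read:entry
/execname == "cpio"/
{
        self->ts = timestamp;
}

syscall::read:return
/self->ts/
{
        @rlat["read(2) latency (ns)"] = quantize(timestamp - self->ts);
        self->ts = 0;
}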
also - is this really coming off a 2540? if so - you should probably
investigate the array throughput numbers and what's happening on the
RAID controller .. i typically find it helpful to understand what the
raw hardware is capable of (hence tools like vdbench to drive an
anticipated load before i configure anything) and then attempt to
configure the various tunables to match after that
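for reference, a minimal vdbench parameter file for a sequential-read pass
against one raw LUN might look something like this (the device path, slice,
thread count, transfer size and run length are all just placeholders):

# quick sequential-read qualification of a single raw LUN
# (all values here are examples - adjust to the actual device and load)
sd=sd1,lun=/dev/rdsk/c4t600A0B80003A8A0B0000096A47B4559Ed0s2,threads=8
wd=wd1,sd=sd1,xfersize=128k,rdpct=100,seekpct=0
rd=run1,wd=wd1,iorate=max,elapsed=60,interval=5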
for now you're pretty much just at the FS/VOP layers and playing with
caching when the real culprit might be more on the vdev interface
layer or below
---
.je