On Jul 4, 2009, at 12:03 AM, Bob Friesenhahn wrote:

% ./diskqual.sh
c1t0d0 130 MB/sec
c1t1d0 130 MB/sec
c2t202400A0B83A8A0Bd31 13422 MB/sec
c3t202500A0B83A8A0Bd31 13422 MB/sec
c4t600A0B80003A8A0B0000096A47B4559Ed0 191 MB/sec
c4t600A0B80003A8A0B0000096E47B456DAd0 192 MB/sec
c4t600A0B80003A8A0B0000096147B451BEd0 192 MB/sec
c4t600A0B80003A8A0B0000096647B453CEd0 192 MB/sec
c4t600A0B80003A8A0B0000097347B457D4d0 212 MB/sec
c4t600A0B800039C9B500000A9C47B4522Dd0 191 MB/sec
c4t600A0B800039C9B500000AA047B4529Bd0 192 MB/sec
c4t600A0B800039C9B500000AA447B4544Fd0 192 MB/sec
c4t600A0B800039C9B500000AA847B45605d0 191 MB/sec
c4t600A0B800039C9B500000AAC47B45739d0 191 MB/sec
c4t600A0B800039C9B500000AB047B457ADd0 191 MB/sec
c4t600A0B800039C9B500000AB447B4595Fd0 191 MB/sec

somehow i don't think that reading the first 64MB (presumably) off a raw disk device 3 times and picking the middle value is going to give you much useful information on the overall state of the disks .. i believe this was more of a quick hack to just validate that there's nothing too far out of the norm. with that said - what are the c2 and c3 devices above? you've got to be caching the heck out of those to get that unbelievable 13 GB/s - so you're really only seeing memory speeds there
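as a quick sanity check - assuming those d31 LUNs are real data LUNs and that an s0 slice actually exists on them (both guesses on my part) - you could read a span bigger than the controller cache straight off the raw device and see what number comes back:

   # dd if=/dev/rdsk/c2t202400A0B83A8A0Bd31s0 of=/dev/null bs=1024k count=2048

2 GB should be more than the controller cache can hold, so that's a lot closer to a real wire/disk number than 64MB read three times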

more useful information would be something more like the old taz or some of the disk IO latency tools when you're driving a workload.
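e.g. just watching per-LUN service times while the copy is actually running goes a long way (plain old iostat, nothing exotic - 5-second intervals, skip idle devices):

   % iostat -xnz 5

asvc_t is the average service time each LUN is delivering and actv is how deep the queue is sitting against it - which is exactly what the MB/sec numbers above hide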

% arc_summary.pl

System Memory:
         Physical RAM:  20470 MB
         Free Memory :  2371 MB
         LotsFree:      312 MB

ZFS Tunables (/etc/system):
         * set zfs:zfs_arc_max = 0x300000000
         set zfs:zfs_arc_max = 0x280000000
         * set zfs:zfs_arc_max = 0x200000000

ARC Size:
         Current Size:             9383 MB (arcsize)
         Target Size (Adaptive):   10240 MB (c)
         Min Size (Hard Limit):    1280 MB (zfs_arc_min)
         Max Size (Hard Limit):    10240 MB (zfs_arc_max)

ARC Size Breakdown:
         Most Recently Used Cache Size:           6%    644 MB (p)
         Most Frequently Used Cache Size:        93%    9595 MB (c-p)

ARC Efficiency:
         Cache Access Total:             674638362
         Cache Hit Ratio:      91%       615586988      [Defined State for buffer]
         Cache Miss Ratio:      8%       59051374       [Undefined State for Buffer]
         REAL Hit Ratio:       87%       590314508      [MRU/MFU Hits Only]

         Data Demand   Efficiency:    96%
         Data Prefetch Efficiency:     7%

        CACHE HITS BY CACHE LIST:
          Anon:                           2%        13626529                [ New Customer, First Cache Hit ]
          Most Recently Used:            78%        480379752 (mru)         [ Return Customer ]
          Most Frequently Used:          17%        109934756 (mfu)         [ Frequent Customer ]
          Most Recently Used Ghost:       0%        5180256 (mru_ghost)     [ Return Customer Evicted, Now Back ]
          Most Frequently Used Ghost:     1%        6465695 (mfu_ghost)     [ Frequent Customer Evicted, Now Back ]
        CACHE HITS BY DATA TYPE:
          Demand Data:                78%        485431759
          Prefetch Data:               0%        3045442
          Demand Metadata:            16%        103900170
          Prefetch Metadata:           3%        23209617
        CACHE MISSES BY DATA TYPE:
          Demand Data:                30%        18109355
          Prefetch Data:              60%        35633374
          Demand Metadata:             6%        3806177
          Prefetch Metadata:           2%        1502468
---------------------------------------------

Prefetch seems to be performing badly. Ben Rockwood's blog entry at http://www.cuddletech.com/blog/pivot/entry.php?id=1040 discusses prefetch. The sample DTrace script on that page only shows cache misses:

vdev_cache_read: 6507827833451031357 read 131072 bytes at offset 6774849536: MISS
vdev_cache_read: 6507827833451031357 read 131072 bytes at offset 6774980608: MISS

Unfortunately, the file-level prefetch DTrace sample script from the same page seems to have a syntax error.

if you're using LUNs off an array - this might be another case of zfs_vdev_max_pending being tuned more for direct-attach drives .. you could be trying to queue up too much I/O against the RAID controller, particularly if the RAID controller is also trying to prefetch out of its cache.
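if you want to experiment with that, it can be dropped on a live system with mdb and made permanent in /etc/system - the 10 below is just a starting point to play with, not a recommendation for this particular box:

   # echo zfs_vdev_max_pending/W0t10 | mdb -kw

and to keep it across reboots, in /etc/system:

   set zfs:zfs_vdev_max_pending = 10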

I tried disabling file level prefetch (zfs_prefetch_disable=1) but did not observe any change in behavior.

this is only going to help if you've got problems in zfetch .. you'd probably see this better by looking for high lock contention in zfetch with lockstat
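something like a 60-second contention sample, then picking out the zfetch locks, would tell you quickly whether that's even in play (the options here are just one reasonable invocation, tune to taste):

   # lockstat -C -D 20 sleep 60 | grep -i zfetch

if nothing zfetch-related shows up near the top, prefetch lock contention isn't your problem and zfs_prefetch_disable isn't going to buy you anything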

# kstat -p zfs:0:vdev_cache_stats
zfs:0:vdev_cache_stats:class    misc
zfs:0:vdev_cache_stats:crtime   130.61298275
zfs:0:vdev_cache_stats:delegations      754287
zfs:0:vdev_cache_stats:hits     3973496
zfs:0:vdev_cache_stats:misses   2154959
zfs:0:vdev_cache_stats:snaptime 451955.55419545

Performance when copying 236 GB of files (each file is 5537792 bytes, with 20001 files per directory) from one directory to another:

Copy Method                             Data Rate
====================================    ==================
cpio -pdum                              75 MB/s
cp -r                                   32 MB/s
tar -cf - . | (cd dest && tar -xf -)    26 MB/s

I would expect data copy rates approaching 200 MB/s.


you might want to dtrace this to break down where the latency is occurring .. e.g. is this a DNLC caching problem, an ARC problem, or a device-level problem
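for the device-level half of that, the stock io-provider pattern below gives you a latency distribution per LUN (nothing in it is specific to your setup; the DNLC/ARC side needs different probes and isn't covered here):

   #!/usr/sbin/dtrace -s
   /* time each buf from io:::start to io:::done and build a
      per-device latency distribution */
   io:::start
   {
           start[args[0]->b_edev, args[0]->b_blkno] = timestamp;
   }

   io:::done
   /start[args[0]->b_edev, args[0]->b_blkno]/
   {
           @lat[args[1]->dev_statname] =
               quantize(timestamp - start[args[0]->b_edev, args[0]->b_blkno]);
           start[args[0]->b_edev, args[0]->b_blkno] = 0;
   }

run it while the cpio is going and the quantize buckets will show whether the c4 LUNs are the ones eating the time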

also - is this really coming off a 2540? if so - you should probably investigate the array throughput numbers and what's happening on the RAID controller .. i typically find it helpful to understand what the raw hardware is capable of (hence tools like vdbench to drive an anticipated load before i configure anything) - and then attempt to configure the various tunables to match after that
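something along these lines would baseline a single LUN for large sequential reads (the device path below is one of yours plus a guessed s2 slice, and the run parameters are only a starting point - adjust xfersize/rdpct/seekpct to whatever load you actually expect):

   cat > /tmp/seqread.parm <<'EOF'
   sd=sd1,lun=/dev/rdsk/c4t600A0B80003A8A0B0000096A47B4559Ed0s2,threads=8
   wd=wd1,sd=sd1,xfersize=128k,rdpct=100,seekpct=0
   rd=rd1,wd=wd1,iorate=max,elapsed=60,interval=5
   EOF
   ./vdbench -f /tmp/seqread.parm

do that against one LUN, then against several at once, and you know what the array can actually deliver before zfs ever enters the picture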

for now you're pretty much just at the FS/VOP layers and playing with caching when the real culprit might be more on the vdev interface layer or below

---
.je