I am still trying to determine why Solaris 10 (Generic_141415-03) ZFS
performs so terribly on my system. I blew a good bit of my personal life
savings on this set-up but am not seeing performance anywhere near
what is expected. Testing with iozone shows that bulk I/O performance
is good, and testing with Jeff Bonwick's 'diskqual.sh' shows the expected
raw disk performance. The problem is that actual observed application
performance sucks, and could often be satisfied by portable USB
drives rather than high-end SAS drives. It could be satisfied by
just one SAS disk drive. The behavior is as if ZFS is very slow to read
data: disks are read at only 2 or 3 MB/second, followed by an
intermittent write on a long cycle, and the drive lights blink slowly. It is
as if ZFS does no successful sequential read-ahead on the files (see the
Prefetch Data hit rate of 0% and Prefetch Data cache miss rate of 60%
below), or there is a semaphore bottleneck somewhere (but CPU use is
very low).
Observed behavior is very program dependent.
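For reference, the slow-read pattern is easy to see by watching the pool
while one of the affected applications runs. A minimal way to observe it
(standard commands, nothing exotic) is:

% zpool iostat -v Sun_2540 5
% iostat -xnz 5

The first shows per-vdev read/write bandwidth in 5-second samples; the
second is the per-device view (-x extended statistics, -n logical device
names, -z suppress idle devices). With the behavior described above, each
disk shows only 2 or 3 MB/second of reads.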
# zpool status Sun_2540
pool: Sun_2540
state: ONLINE
status: The pool is formatted using an older on-disk format. The pool can
still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
pool will no longer be accessible on older software versions.
scrub: scrub completed after 0h46m with 0 errors on Mon Jun 29 05:06:33 2009
config:
        NAME                                       STATE     READ WRITE CKSUM
        Sun_2540                                   ONLINE       0     0     0
          mirror                                   ONLINE       0     0     0
            c4t600A0B80003A8A0B0000096A47B4559Ed0  ONLINE       0     0     0
            c4t600A0B800039C9B500000AA047B4529Bd0  ONLINE       0     0     0
          mirror                                   ONLINE       0     0     0
            c4t600A0B80003A8A0B0000096E47B456DAd0  ONLINE       0     0     0
            c4t600A0B800039C9B500000AA447B4544Fd0  ONLINE       0     0     0
          mirror                                   ONLINE       0     0     0
            c4t600A0B80003A8A0B0000096147B451BEd0  ONLINE       0     0     0
            c4t600A0B800039C9B500000AA847B45605d0  ONLINE       0     0     0
          mirror                                   ONLINE       0     0     0
            c4t600A0B80003A8A0B0000096647B453CEd0  ONLINE       0     0     0
            c4t600A0B800039C9B500000AAC47B45739d0  ONLINE       0     0     0
          mirror                                   ONLINE       0     0     0
            c4t600A0B80003A8A0B0000097347B457D4d0  ONLINE       0     0     0
            c4t600A0B800039C9B500000AB047B457ADd0  ONLINE       0     0     0
          mirror                                   ONLINE       0     0     0
            c4t600A0B800039C9B500000A9C47B4522Dd0  ONLINE       0     0     0
            c4t600A0B800039C9B500000AB447B4595Fd0  ONLINE       0     0     0
errors: No known data errors
% ./diskqual.sh
c1t0d0 130 MB/sec
c1t1d0 130 MB/sec
c2t202400A0B83A8A0Bd31 13422 MB/sec
c3t202500A0B83A8A0Bd31 13422 MB/sec
c4t600A0B80003A8A0B0000096A47B4559Ed0 191 MB/sec
c4t600A0B80003A8A0B0000096E47B456DAd0 192 MB/sec
c4t600A0B80003A8A0B0000096147B451BEd0 192 MB/sec
c4t600A0B80003A8A0B0000096647B453CEd0 192 MB/sec
c4t600A0B80003A8A0B0000097347B457D4d0 212 MB/sec
c4t600A0B800039C9B500000A9C47B4522Dd0 191 MB/sec
c4t600A0B800039C9B500000AA047B4529Bd0 192 MB/sec
c4t600A0B800039C9B500000AA447B4544Fd0 192 MB/sec
c4t600A0B800039C9B500000AA847B45605d0 191 MB/sec
c4t600A0B800039C9B500000AAC47B45739d0 191 MB/sec
c4t600A0B800039C9B500000AB047B457ADd0 191 MB/sec
c4t600A0B800039C9B500000AB447B4595Fd0 191 MB/sec
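(For anyone who does not have diskqual.sh handy, the same sort of number
can be produced with a timed raw sequential read. This is only a sketch,
using one of the LUNs above and assuming the data sits on slice s0:

% ptime dd if=/dev/rdsk/c4t600A0B80003A8A0B0000096A47B4559Ed0s0 of=/dev/null bs=1048576 count=1024

1024 MB divided by the 'real' time reported by ptime gives MB/second. It
is a read-only test, so it is safe against a live pool, though it does
compete with any other I/O in progress.)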
% arc_summary.pl
System Memory:
Physical RAM: 20470 MB
Free Memory : 2371 MB
LotsFree: 312 MB
ZFS Tunables (/etc/system):
* set zfs:zfs_arc_max = 0x300000000
set zfs:zfs_arc_max = 0x280000000
* set zfs:zfs_arc_max = 0x200000000
ARC Size:
Current Size: 9383 MB (arcsize)
Target Size (Adaptive): 10240 MB (c)
Min Size (Hard Limit): 1280 MB (zfs_arc_min)
Max Size (Hard Limit): 10240 MB (zfs_arc_max)
ARC Size Breakdown:
Most Recently Used Cache Size: 6% 644 MB (p)
Most Frequently Used Cache Size: 93% 9595 MB (c-p)
ARC Efficency:
Cache Access Total: 674638362
Cache Hit Ratio: 91% 615586988 [Defined State for buffer]
Cache Miss Ratio: 8% 59051374 [Undefined State for Buffer]
REAL Hit Ratio: 87% 590314508 [MRU/MFU Hits Only]
Data Demand Efficiency: 96%
Data Prefetch Efficiency: 7%
CACHE HITS BY CACHE LIST:
Anon: 2% 13626529 (anon) [ New Customer, First Cache Hit ]
Most Recently Used: 78% 480379752 (mru) [ Return Customer ]
Most Frequently Used: 17% 109934756 (mfu) [ Frequent Customer ]
Most Recently Used Ghost: 0% 5180256 (mru_ghost) [ Return Customer Evicted, Now Back ]
Most Frequently Used Ghost: 1% 6465695 (mfu_ghost) [ Frequent Customer Evicted, Now Back ]
CACHE HITS BY DATA TYPE:
Demand Data: 78% 485431759
Prefetch Data: 0% 3045442
Demand Metadata: 16% 103900170
Prefetch Metadata: 3% 23209617
CACHE MISSES BY DATA TYPE:
Demand Data: 30% 18109355
Prefetch Data: 60% 35633374
Demand Metadata: 6% 3806177
Prefetch Metadata: 2% 1502468
---------------------------------------------
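As I understand it, arc_summary.pl derives those percentages from the
arcstats kstat, so the raw demand/prefetch counters can also be checked
directly:

# kstat -p zfs:0:arcstats | egrep 'demand|prefetch'

The prefetch_data_hits vs. prefetch_data_misses pair there tells the same
story as the 0% / 60% figures above.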
Prefetch seems to be performing badly. Ben Rockwood's blog entry
at http://www.cuddletech.com/blog/pivot/entry.php?id=1040 discusses
prefetch. The sample DTrace script on that page only shows cache
misses:
vdev_cache_read: 6507827833451031357 read 131072 bytes at offset 6774849536: MISS
vdev_cache_read: 6507827833451031357 read 131072 bytes at offset 6774980608: MISS
Unfortunately, the file-level prefetch DTrace sample script from the
same page seems to have a syntax error.
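As a cruder substitute for that script, the file-level prefetcher keeps
its own kstat counters (at least on recent kernels), which give hit/miss
totals without needing DTrace:

# kstat -p zfs:0:zfetchstats

I am not sure every Solaris 10 update exports this, but if it is present
the hits/misses lines show whether dmu_zfetch is accomplishing anything.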
I tried disabling file level prefetch (zfs_prefetch_disable=1) but did
not observe any change in behavior.
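For reference, the usual ways to flip that tunable are a live poke with
mdb (takes effect immediately) or an /etc/system entry (takes effect at
boot):

# echo zfs_prefetch_disable/W0t1 | mdb -kw      (disable file-level prefetch)
# echo zfs_prefetch_disable/W0t0 | mdb -kw      (re-enable it)

or in /etc/system:

set zfs:zfs_prefetch_disable = 1

The mdb form is convenient for testing since it can be toggled back and
forth without a reboot.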
# kstat -p zfs:0:vdev_cache_stats
zfs:0:vdev_cache_stats:class misc
zfs:0:vdev_cache_stats:crtime 130.61298275
zfs:0:vdev_cache_stats:delegations 754287
zfs:0:vdev_cache_stats:hits 3973496
zfs:0:vdev_cache_stats:misses 2154959
zfs:0:vdev_cache_stats:snaptime 451955.55419545
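Those counters work out to a hit rate of roughly 65% for the device-level
(vdev) cache: hits / (hits + misses) = 3973496 / 6128455. A quick way to
compute it directly from the kstat:

# kstat -p zfs:0:vdev_cache_stats | nawk '/:hits/ {h=$NF} /:misses/ {m=$NF} END {printf("vdev cache hit rate: %.1f%%\n", 100*h/(h+m))}'

So the vdev-level cache is at least getting some hits, unlike the
file-level prefetch.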
Performance when copying 236 GB of files (each file is 5537792 bytes,
with 20001 files per directory) from one directory to another:
Copy Method Data Rate
==================================== ==================
cpio -pdum 75 MB/s
cp -r 32 MB/s
tar -cf - . | (cd dest && tar -xf -) 26 MB/s
I would expect data copy rates approaching 200 MB/s.
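(For what it is worth, the MB/s figures are just total bytes moved divided
by elapsed wall-clock time. A measurement along those lines, with made-up
source and destination paths, looks like:

% cd /Sun_2540/src
% ptime sh -c 'find . -depth -print | cpio -pdum /Sun_2540/dest'

236 GB divided by the 'real' seconds reported by ptime gives the data
rate.)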
I have not seen a peep from a ZFS developer on this list for a month
or two. It would be useful if one of them would turn up to explain possible
causes for this level of performance. If I am encountering this
problem, then it is likely that many others are as well.
Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/