There has been no forward progress on the ZFS read performance issue
for a week now. A 4X reduction in file read performance simply because
the file has been read before is terrible, and of course the situation
is considerably worse if the file was previously mmapped as well. Many
of us have paid a lot of money to Sun and were not aware that ZFS is
sucking the life out of our expensive Sun hardware.
It is trivially easy to reproduce this problem on multiple machines.
For example, I reproduced it on my Blade 2500 (SPARC), which uses a
simple mirrored rpool. On that system reads are 1.8X slower once the
file has been accessed previously.
In order to raise the visibility of this issue, I invite others to see
if they can reproduce it in their ZFS pools. The script at
http://www.simplesystems.org/users/bfriesen/zfs-discuss/zfs-cache-test.ksh
implements a simple test. It requires a fair amount of disk space to
run; the main requirement is that the disk space consumed be more
than available memory so that file data gets purged from the ARC.
The script needs to run as root since it creates a filesystem and uses
mount/umount. The script does not destroy any data.
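In outline, the test amounts to something like the sketch below. This
is only a rough equivalent for readers who want to see what is being
measured; the dd-based file creation is illustrative and the real
script has more error checking and adjustable settings, so please use
the actual script for testing.

#!/bin/ksh
# Rough outline of zfs-cache-test.ksh (illustrative only; the real
# script has more error checking and adjustable settings).

POOL=${1:-rpool}              # pool to test; 'rpool' by default
FS=$POOL/zfscachetest         # scratch filesystem created for the test
NFILES=3000                   # number of files to create
FILESIZE=8192000              # bytes per file; the total must exceed RAM

zfs create $FS || exit 1
print "Creating data file set ($NFILES files of $FILESIZE bytes) under /$FS ..."
i=0
while (( i < NFILES )); do
  # File creation shown with dd here; the real script may differ.
  dd if=/dev/zero of=/$FS/file$i bs=$FILESIZE count=1 > /dev/null 2>&1
  (( i += 1 ))
done
print "Done!"

# Unmount/remount so the first read pass starts with a cold cache.
zfs unmount $FS
zfs mount $FS
print "Doing initial (unmount/mount) 'cpio -o > /dev/null'"
time find /$FS -type f | cpio -o > /dev/null

# Read the same files a second time.  Since the data set is larger
# than memory it should already have been purged from the ARC, yet
# this is the pass that runs dramatically slower.
print "Doing second 'cpio -o > /dev/null'"
time find /$FS -type f | cpio -o > /dev/null

print "Feel free to clean up with 'zfs destroy $FS'."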
There are several adjustments which may be made at the front of the
script. The pool 'rpool' is used by default, but the name of the pool
to test may be supplied as an argument. For example, here is a run
against a pool named 'Sun_2540':
# ./zfs-cache-test.ksh Sun_2540
zfs create Sun_2540/zfscachetest
Creating data file set (3000 files of 8192000 bytes) under
/Sun_2540/zfscachetest ...
Done!
zfs unmount Sun_2540/zfscachetest
zfs mount Sun_2540/zfscachetest
Doing initial (unmount/mount) 'cpio -o > /dev/null'
48000247 blocks
real 2m54.17s
user 0m7.65s
sys 0m36.59s
Doing second 'cpio -o > /dev/null'
48000247 blocks
real 11m54.65s
user 0m7.70s
sys 0m35.06s
Feel free to clean up with 'zfs destroy Sun_2540/zfscachetest'.
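(For the record, the second pass takes 714.65 seconds versus 174.17
seconds for the initial pass, which is the roughly 4X slowdown
mentioned above.)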
And here is a similar run on my Blade 2500 using the default rpool:
# ./zfs-cache-test.ksh
zfs create rpool/zfscachetest
Creating data file set (3000 files of 8192000 bytes) under
/rpool/zfscachetest ...
Done!
zfs unmount rpool/zfscachetest
zfs mount rpool/zfscachetest
Doing initial (unmount/mount) 'cpio -o > /dev/null'
48000247 blocks
real 13m3.91s
user 2m43.04s
sys 9m28.73s
Doing second 'cpio -o > /dev/null'
48000247 blocks
real 23m50.27s
user 2m41.81s
sys 9m46.76s
Feel free to clean up with 'zfs destroy rpool/zfscachetest'.
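(Here the second pass takes 1430.27 seconds versus 783.91 seconds for
the initial pass, which is the 1.8X slowdown mentioned earlier.)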
I am interested to hear about systems which do not suffer from this
bug.
Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/