Hi,
We're doing some benchmarking at a customer site (using IOzone), and for some specific small-block random tests, performance of their X4500 is very poor (~1.2 MB/s aggregate throughput for a 5+1 RAIDZ). Specifically, the test is the IOzone multithreaded throughput test with an 8GB file size and an 8KB record size, with the server physmem'd down to 2GB. I noticed a couple of peculiar anomalies while investigating the slow results. I am wondering whether Sun has any best practices, tips for optimizing small-block random I/O on ZFS, or other documents that might explain what we're seeing and guide us on how to most effectively deploy ZFS in an environment with heavy small-block random I/O.

The first anomaly: Brendan Gregg's CacheKit Perl script fcachestat shows the segmap cache is hardly used. Occasionally during the IOzone random read benchmark, while the disks are pulling about 20 MB/s in aggregate, the segmap cache gets 100% hits for 1-3 attempts *every 10 seconds*; all other samples show 0% hits on zero attempts. I don't know the kernel I/O path as well as I'd like, but I tried to watch ZFS fetch a file/offset block from disk by DTracing fbt::zfs_getpage (assuming it was the ZFS equivalent of ufs_getpage) and got no hits there either. In other words, it's as if ZFS isn't using the segmap cache at all.

Secondly, DTrace scripts show the IOzone application is reading 8KB blocks, but by the time the physical I/O happens, each request has ballooned into a 26KB read on every disk. In other words, a single 8KB application read generates 156KB of actual disk reads.

We tried changing the ZFS recordsize from 128KB down to 8KB (recreating the zpool and ZFS file system, and setting recordsize before creating the file), and that made the performance even worse, which has thrown us for a loop.

I appreciate any assistance or direction you might be able to provide!

Thanks!
Marcel
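
P.S. In case it helps with diagnosis, here is roughly what we are running; the thread count, dataset name, and file paths below are placeholders rather than the exact values on the customer system.

The IOzone throughput run (sequential write to lay the files down, then random read/write, 8GB per file, 8KB records) looks something like:

    iozone -t 2 -s 8g -r 8k -i 0 -i 2 -F /tank/iozone/f1 /tank/iozone/f2

The DTrace checks were along these lines (simplified one-liners rather than the full scripts). The first counts calls to zfs_getpage and, for comparison, zfs_read, on the assumption that zfs_read is the normal VOP read entry point on this build:

    dtrace -n 'fbt::zfs_getpage:entry,fbt::zfs_read:entry { @[probefunc] = count(); }'

The second compares the read sizes issued by IOzone with the sizes of the physical I/Os that actually reach the disks (it lumps all physical I/O together, reads and writes):

    dtrace -n 'syscall::read:entry /execname == "iozone"/ { @["app read bytes"] = quantize(arg2); }
               io:::start { @["physical I/O bytes"] = quantize(args[0]->b_bcount); }'

The recordsize change was made before the test files were created (it only affects files written after the property is set), e.g.:

    zfs set recordsize=8k tank/iozone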
_______________________________________________ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss