After reading many threads on ZFS performance today (the top of the forum list, plus some chains of references), I applied a bit of tuning to the server.
In particular, I've set zfs_write_limit_override to 384MB, so the cache is spooled to disk more frequently (when streaming lots of writes) and in smaller increments:

  # echo zfs_write_limit_override/W0t402653184 | mdb -kw

and, to persist it across reboots, in /etc/system:

  set zfs:zfs_write_limit_override = 0x18000000

The system seems to be working more smoothly (vs. jerky), and "zpool iostat" values are not quite as jumpy (e.g. 320MBps to 360MBps for a given test). The results also seem faster and more consistent.

With this tuning applied, I'm writing to a 40GB zvol, 1M records (count=1048576) of:

  4k  (bs=4096):      17s (12s),  241MBps
  8k  (bs=8192):      29s (18s),  282MBps
  16k (bs=16384):     54s (30s),  303MBps
  32k (bs=32768):    113s (56s),  290MBps
  64k (bs=65536):    269s (104s), 243MBps

And 10240 larger records of:

  1MB (bs=1048576):   33s (8s),   310MBps
  2MB (bs=2097152):   74s (23s),  276MBps

And 1024 yet larger records:

  1MB  (bs=1048576):    4s (1s),   256MBps
  4MB  (bs=4194304):   12s (5s),   341MBps
  16MB (bs=16777216):  71s (18s),  230MBps
  32MB (bs=33554432): 150s (36s),  218MBps

So the zvol picture is considerably better now (albeit not perfect: no values come near the 1GBps noted previously in "zpool iostat"), for both small and large blocks. For filesystem datasets the new values are very similar (down to tenths of a second at the smaller blocksizes!), but as the blocksize grows, the filesystems start losing to the zvols. Overall, though, the result seems lower than what I achieved before tuning.
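For reference, each of the figures above is a single sequential write of /dev/zero through dd; here is a minimal sketch of the shape of one run, with a quick check of the throughput arithmetic (the target path and the plain-file fallback are my placeholders, not the exact commands used):

```shell
# Sketch of one benchmark run; the real target was the 40GB zvol
# (something like /dev/zvol/rdsk/<pool>/<vol> -- path hypothetical).
TARGET=${TARGET:-/tmp/zfs-bench.bin}  # plain-file fallback for illustration
BS=4096                               # block size under test
COUNT=1024                            # the real runs used count=1048576
dd if=/dev/zero of="$TARGET" bs="$BS" count="$COUNT"

# Sanity check on the reported rates: MBps = total bytes / seconds / 2^20.
# For the bs=4096 run: 1048576 records * 4096 bytes = 4GiB in 17s:
awk 'BEGIN { printf "%.0f MBps\n", 1048576 * 4096 / 17 / 1048576 }'  # -> 241 MBps
```

That matches the 241MBps reported for the 4k case, so the listed rates are MiB of payload per second of wall time (parity traffic on the raidz vdevs not included).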
1M records (count=1048576) of:

  4k  (bs=4096):      17s (12s),  241MBps
  8k  (bs=8192):      29s (18s),  282MBps
  16k (bs=16384):     67s (30s),  245MBps
  32k (bs=32768):    144s (55s),  228MBps
  64k (bs=65536):    275s (98s),  238MBps

And 10240 larger records go better:

  1MB (bs=1048576):   33s (9s),   310MBps
  2MB (bs=2097152):   70s (21s),  292MBps

And 1024 yet larger records:

  1MB  (bs=1048576):  2.8s (0.8s), 366MBps
  4MB  (bs=4194304):   12s (4s),   341MBps
  16MB (bs=16777216):  55s (17s),  298MBps
  32MB (bs=33554432): 140s (36s),  234MBps

Occasionally I did reruns; user time for the same setup can vary significantly (e.g. 65s vs. 84s), while the system time stays pretty much the same. "zpool iostat" shows larger values (typically around 320MBps), but I think that can be attributed to writing parity stripes on the raidz vdevs.

//Jim

PS: For completeness, I'll try smaller blocks without tuning in a future post.

-- 
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss