> If those servers are on physical boxes right now, I'd do some perfmon
> captures and add up the IOPS.

Using perfmon to get a sense of what is required is a good idea; use the
95th percentile to be conservative. The counters I have used are in the
PhysicalDisk object. Don't ignore the latency counters either. In my
book, anything consistently over 20ms or so is excessive.
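If you would rather script the capture than babysit the perfmon GUI,
typeperf (built into Windows) can log the same PhysicalDisk counters to
a CSV. Just a sketch: the 30-second interval and 24-hour window (2880
samples) are my own assumptions, and you can swap (_Total) for (*) if
you want per-disk numbers:

typeperf "\PhysicalDisk(_Total)\Disk Transfers/sec" ^
         "\PhysicalDisk(_Total)\Avg. Disk sec/Read" ^
         "\PhysicalDisk(_Total)\Avg. Disk sec/Write" ^
         -si 30 -sc 2880 -o disk_iops.csv

Disk Transfers/sec is reads plus writes, i.e. your IOPS, and the two
latency counters report in seconds, so 0.020 is the 20ms line I
mentioned above.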
I run 30+ VMs on an EqualLogic array with 14 SATA disks, broken up as
two striped 6-disk RAID5 sets (RAID 50) with 2 hot spares. That array
is, on average, about 25% loaded from an I/O standpoint; obviously my
VMs are pretty light. And the EQL gear is *fast*, which makes me feel
better about spending all of that money :).

>> Regarding ZIL usage, from what I have read you will only see
>> benefits if you are using NFS-backed storage, but that it can be
>> significant.
>
> link?

From the ZFS Evil Tuning Guide
(http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide):

"ZIL stands for ZFS Intent Log. It is used during synchronous write
operations."

and further down:

"If you've noticed terrible NFS or database performance on a SAN
storage array, the problem is not with ZFS, but with the way the disk
drivers interact with the storage devices. ZFS is designed to work with
storage devices that manage a disk-level cache. ZFS commonly asks the
storage device to ensure that data is safely placed on stable storage
by requesting a cache flush. For JBOD storage, this works as designed
and without problems. For many NVRAM-based storage arrays, a problem
might come up if the array takes the cache flush request and actually
does something with it rather than ignoring it. Some storage will flush
their caches despite the fact that the NVRAM protection makes those
caches as good as stable storage.

ZFS issues infrequent flushes (every 5 seconds or so) after the
uberblock updates. The problem here is fairly inconsequential. No
tuning is warranted here. ZFS also issues a flush every time an
application requests a synchronous write (O_DSYNC, fsync, NFS commit,
and so on). The completion of this type of flush is waited upon by the
application and impacts performance. Greatly so, in fact. From a
performance standpoint, this neutralizes the benefits of having
NVRAM-based storage."

When I was testing iSCSI vs. NFS, it was clear that iSCSI was not doing
sync writes while NFS was. Here are some zpool iostat numbers.

iSCSI testing, using iometer with the RealLife workload (65% read, 60%
random, 8k transfers - see the link in my previous post). It is clear
that writes are being cached in RAM and then spun off to disk in
bursts:

# zpool iostat data01 1
              capacity     operations    bandwidth
pool        used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
data01      55.5G  20.4T    691      0  4.21M      0
data01      55.5G  20.4T    632      0  3.80M      0
data01      55.5G  20.4T    657      0  3.93M      0
data01      55.5G  20.4T    669      0  4.12M      0
data01      55.5G  20.4T    689      0  4.09M      0
data01      55.5G  20.4T    488  1.77K  2.94M  9.56M
data01      55.5G  20.4T     29  4.28K   176K  23.5M
data01      55.5G  20.4T     25  4.26K   165K  23.7M
data01      55.5G  20.4T     20  3.97K   133K  22.0M
data01      55.6G  20.4T    170  2.26K  1.01M  11.8M
data01      55.6G  20.4T    678      0  4.05M      0
data01      55.6G  20.4T    625      0  3.74M      0
data01      55.6G  20.4T    685      0  4.17M      0
data01      55.6G  20.4T    690      0  4.04M      0
data01      55.6G  20.4T    679      0  4.02M      0
data01      55.6G  20.4T    664      0  4.03M      0
data01      55.6G  20.4T    699      0  4.27M      0
data01      55.6G  20.4T    423  1.73K  2.66M  9.32M
data01      55.6G  20.4T     26  3.97K   151K  21.8M
data01      55.6G  20.4T     34  4.23K   223K  23.2M
data01      55.6G  20.4T     13  4.37K  87.1K  23.9M
data01      55.6G  20.4T     21  3.33K   136K  18.6M
data01      55.6G  20.4T    468    496  2.89M  1.82M
data01      55.6G  20.4T    687      0  4.13M      0
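As an aside, adding -v to the same command breaks those numbers out per
vdev and per disk, which makes it easy to see whether one vdev is doing
all the work:

# zpool iostat -v data01 1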
Testing against NFS shows writes going to disk continuously:

NFS testing:

              capacity     operations    bandwidth
pool        used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
data01      59.6G  20.4T     57    216   352K  1.74M
data01      59.6G  20.4T     41     21   660K  2.74M
data01      59.6G  20.4T     44     24   655K  3.09M
data01      59.6G  20.4T     41     23   598K  2.97M
data01      59.6G  20.4T     34     33   552K  4.21M
data01      59.6G  20.4T     46     24   757K  3.09M
data01      59.6G  20.4T     39     24   593K  3.09M
data01      59.6G  20.4T     45     25   687K  3.22M
data01      59.6G  20.4T     45     23   683K  2.97M
data01      59.6G  20.4T     33     23   492K  2.97M
data01      59.6G  20.4T     16     41   214K  1.71M
data01      59.6G  20.4T      3  2.36K  53.4K  30.4M
data01      59.6G  20.4T      1  2.23K  20.3K  29.2M
data01      59.6G  20.4T      0  2.24K  30.2K  28.9M
data01      59.6G  20.4T      0  1.93K  30.2K  25.1M
data01      59.6G  20.4T      0  2.22K      0  28.4M
data01      59.7G  20.4T     21    295   317K  4.48M
data01      59.7G  20.4T     32     12   495K  1.61M
data01      59.7G  20.4T     35     25   515K  3.22M
data01      59.7G  20.4T     36     11   522K  1.49M
data01      59.7G  20.4T     33     24   508K  3.09M
data01      59.7G  20.4T     35     23   536K  2.97M
data01      59.7G  20.4T     32     23   483K  2.97M
data01      59.7G  20.4T     37     37   538K  4.70M

Note that the ZIL is being used here, just not on a separate device;
the periodic bursts of writes show it being flushed. You can also see
reads stall to nearly zero while the ZIL is dumping. Not good. This
thread discusses the behavior:

http://www.opensolaris.org/jive/thread.jspa?threadID=106453

Coming from a mostly Windows world, I really like the tools you get on
OpenSolaris for seeing this kind of thing.

-Scott
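P.S. For anyone wanting to try the separate-device route, moving the
ZIL onto its own log vdev is a one-liner. This is just a sketch: the
device name below is made up, you would want something with a fast,
protected write cache (SSD or NVRAM), and last I checked a log device
could not be removed once added, so experiment on a scratch pool first:

# zpool add data01 log c2t0d0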