Hi Paul,

I am using ESXi 4.0 with an NFS-on-ZFS datastore running on OSOL b134. It 
previously ran on Solaris 10u7 with VMware Server 2.x. The disks are SATA 
drives in a JBOD attached over FC.

I'll try to summarize my experience here, although our system does not provide 
services to end users and thus is not very stressed (it supports our internal 
developers only).

There are some things to know before setting up ZFS storage, in my opinion. 
1. You cannot shrink a zpool. You can only detach disks from mirrors, not from 
raidz. You can grow it, though, either by replacing disks with higher-capacity 
ones or by adding more disks to the same pool.
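As a sketch of both growth paths (the pool name "tank" and the device names 
are made-up examples, not from my setup):

```shell
# Grow by swapping each disk for a larger one, one at a time.
# With autoexpand=on the pool picks up the extra space once every
# disk in the vdev has been replaced and resilvered.
zpool set autoexpand=on tank
zpool replace tank c1t0d0 c2t0d0   # wait for resilver, repeat per disk

# Or grow by adding another mirror pair to the same pool:
zpool add tank mirror c3t0d0 c3t1d0
```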

2. Due to ZFS's inherently coherent, copy-on-write structure, sync writes 
(especially random ones) are its worst enemy: the cure is to bring a pair of 
mirrored SSDs into the pool or to use a battery-backed write cache, especially 
if you want to use raidz.
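The mirrored-SSD cure means adding them as a dedicated intent log (slog). A 
sketch, with hypothetical device names:

```shell
# Add a mirrored pair of SSDs as a separate ZFS intent log (slog);
# sync writes are acknowledged from the SSDs, and the platters see
# them later as larger, more sequential flushes.
zpool add tank log mirror c4t0d0 c4t1d0

# Check that the log vdev shows up under the pool:
zpool status tank
```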
3. VMware, over either NFS or iSCSI, issues synchronous writes. Because of 
point 2 above, if your workload and number of VMs are significant you will 
definitely need some kind of storage device based on memory and not on 
platters. YMMV.

4. ZFS snapshotting is great, but it can burn a sizeable amount of disk if you 
leave your VMs' local disks mounted without the noatime option (assuming they 
are Unix machines), because the vmdks will get written to even if the 
processes inside the VM only read files. (In my case ~60 idle Linux machines 
burned up to 200MB/hr of snapshot space, generating an average of 2MB/sec of 
write traffic.)
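Inside a Linux guest, the fix is a remount plus an fstab change (paths and 
filesystem type below are illustrative):

```shell
# Stop atime updates so pure reads no longer dirty the vmdk
# (and thus no longer inflate snapshots on the storage side).
mount -o remount,noatime /

# To make it stick, add noatime to the options field in /etc/fstab,
# e.g.:  /dev/sda1  /  ext3  defaults,noatime  1 1
```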

5. Avoid mixing disks of different size and performance in a zpool, unless of 
course they are doing different jobs. ZFS doesn't weight data placement by 
size or performance; it spreads data out evenly. A zpool will perform like the 
slowest of its members (not all the time, but when the workload is high the 
slowest disk will be a limiting factor).
6. ZFS performs badly on pools that are more than 80% full. Keep that in mind 
when you size things up.

7. ZFS compression works wonders, especially the default one (lzjb): it costs 
little CPU, it doesn't increase latency (unless you have a very unbalanced 
CPU/disk system), and it saves both space and bandwidth.
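Turning it on and checking the payoff is a one-liner each (dataset name is a 
hypothetical example):

```shell
# Enable the default (lzjb) compression on the datastore filesystem;
# only blocks written after this point are compressed.
zfs set compression=on tank/vmstore

# See how much it is actually saving:
zfs get compressratio tank/vmstore
```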
8. By mirroring/raiding things at the OS level, ZFS effectively multiplies the 
bandwidth used on the bus. My SATA disks can sustain 60MB/sec of writes, but 
in a zpool made of 6 mirrors of 2 disks each over a 2Gbit fibre the maximum 
throughput is ~95MB/sec: the fibre maxes out at 190MB/sec, but Solaris needs 
to write to both disks of each mirror. You can partially work around this by 
putting each side of a mirror on a different storage enclosure and/or by 
increasing the number of paths towards the disks.
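The arithmetic behind that ceiling can be sketched like this (190MB/sec being 
the usable payload rate of a 2Gbit FC link):

```shell
# Every client write goes to both sides of a mirror, so the fabric
# carries the payload twice: usable link rate divided by the number
# of copies gives the throughput ceiling the clients actually see.
link_mb_per_s=190      # ~usable payload of a 2Gbit FC link
copies_per_write=2     # two-way mirrors
echo $(( link_mb_per_s / copies_per_write ))   # prints 95
```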

9. Deduplication is great on paper and can be wonderful in virtualized 
environments, but it has a BIG cost upfront: search around and do your math, 
but be aware that you'll need tons of RAM and SSDs to effectively deduplicate 
multi-terabyte storage. Also, the common opinion is that it's not ready for 
production.
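A rough back-of-the-envelope for that math: each unique block costs on the 
order of a few hundred bytes of dedup-table (DDT) metadata that you really 
want resident in RAM or L2ARC. Assuming ~320 bytes per entry and uniform 
128KB blocks (both rough figures; real pools have many smaller blocks, so 
treat this as a lower bound):

```shell
# Estimate DDT size for a hypothetical 10 TiB pool of 128 KiB blocks.
pool_bytes=$(( 10 * 1024 * 1024 * 1024 * 1024 ))  # 10 TiB
block_size=131072                                  # 128 KiB records
ddt_entry=320                                      # ~bytes/entry (rough)
blocks=$(( pool_bytes / block_size ))
echo $(( blocks * ddt_entry / 1024 / 1024 / 1024 ))  # prints 25 (GiB of DDT)
```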

If the analysis/tests of your use case tell you ZFS is a viable option, I 
think you should give it a try. Administration is wonderfully flexible and 
easy: once you have set up your zpools (which is the really critical phase) 
you can do practically anything you want in the most efficient way. So far 
I'm very pleased with it.

-- 
Simone Caldana
Senior Consultant
Critical Path
via Cuniberti 58, 10100 Torino, Italia
+39 011 4513811 (Direct)
+39 011 4513825 (Fax)
simone.cald...@criticalpath.net
http://www.cp.net/

Critical Path
A global leader in digital communications


_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
