Working on a POC for high IO workloads, and I'm running into a bottleneck that I'm not sure I can solve. The testbed looks like this:

SuperMicro 6026-6RFT+ barebones w/ dual 5506 CPUs, 72GB RAM, and ESXi
VM – 4GB RAM, 1vCPU
Connectivity: dual 10Gbit Ethernet to Cisco Nexus 5010

Target Nexenta system:

Intel barebones, dual Xeon 5620 CPUs, 192GB RAM, Nexenta 3.1.3 Enterprise
Intel X520 dual-port 10Gbit Ethernet – LACP active vPC to the Nexus 5010 switches
2x LSI 9201-16E HBAs, 1x LSI 9200-8e HBA
5 DAEs (3 in use for this test)
1 DAE – connected (multipathed) to the LSI 9200-8e, loaded w/ 6x STEC ZeusRAM SSDs striped for ZIL and 6x OCZ Talos C 230GB drives for L2ARC
2 DAEs – connected (multipathed) to one LSI 9201-16E, with 24x 600GB 15k Seagate Cheetah drives
Obviously, data integrity is not guaranteed with this setup.
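
For reference, the pool is laid out roughly along these lines – a sketch only, with placeholder device names, and the data vdevs shown as mirrors purely for illustration (only the striped ZIL and the L2ARC devices are as described above):

  # Hypothetical device names; data vdev layout is illustrative only
  zpool create tank \
      mirror c3t0d0 c3t1d0 mirror c3t2d0 c3t3d0 \
      log c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 \
      cache c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0
  # (remaining mirror vdevs across the 24 Cheetahs omitted for brevity)

Unmirrored log devices listed like this get striped by ZFS, which matches the ZIL setup above; cache devices are always striped.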

Testing with IOMeter from the Windows guest, using a 10GB test file and a queue depth of 64. The share is set up with a 4k recordsize, compression disabled, and access time (atime) disabled. I'm seeing performance as follows:

~50,000 IOPS 4k random read, 200MB/sec – 30% CPU utilization on Nexenta, ~90% CPU utilization on the guest OS. I'm guessing the guest OS is the bottleneck here; going to try physical hardware next week.
~25,000 IOPS 4k random write, 100MB/sec – ~70% CPU utilization on Nexenta, ~45% CPU utilization on the guest OS. Feels like the Nexenta CPU is the bottleneck; load average of 2.5.
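
So far I've mostly been going by top and the load average on the Nexenta side; to confirm the CPU theory I'm planning to watch it with the usual illumos tools, something like this (just a sketch – intervals and flags to taste):

  mpstat 1        # per-CPU utilization: is one core pegged, or is the load spread evenly?
  prstat -mLc 1   # per-thread microstates: which threads are actually burning CPU
  iostat -xn 1    # per-device service times / %busy: rule out the disks themselves

On the guest side, with only 1 vCPU, perfmon should show quickly enough whether IOMeter itself is maxing out the core.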

A quick test with a 128k recordsize and 128k IO looked to be around 400MB/sec; I can't remember CPU utilization on either side. I'll retest and report those numbers.
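
For completeness, the share settings for the two tests are just the standard dataset properties (the dataset name below is a placeholder):

  # 4k test – placeholder dataset name
  zfs set recordsize=4k tank/iotest
  zfs set compression=off tank/iotest
  zfs set atime=off tank/iotest
  # 128k test – only the recordsize changes
  zfs set recordsize=128k tank/iotest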

It feels like something is adding more overhead than I would expect on the 4k recordsize/IO workloads. Any thoughts on where I should start? I'd really like to see closer to 10Gbit performance here, but it seems like the hardware isn't able to cope with it.
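
When I rerun this from physical hardware I'll use the same workload; for anyone who wants to compare, an fio invocation that should roughly match the IOMeter 4k random read settings (my translation, not verified against IOMeter's exact behavior, and the path is a placeholder) would be something like:

  fio --name=4k-randread --filename=/mnt/share/testfile --size=10g \
      --bs=4k --rw=randread --iodepth=64 --ioengine=libaio --direct=1 \
      --runtime=300 --time_based
  # use --ioengine=windowsaio instead when running it from a Windows box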
