Working on a POC for high-IO workloads, and I'm running into a bottleneck that I'm not sure I can solve. The testbed looks like this:
Initiator: SuperMicro 6026-6RFT+ barebones w/ dual Xeon 5506 CPUs, 72GB RAM, running ESXi
Guest VM: 4GB RAM, 1 vCPU
Connectivity: dual 10Gbit Ethernet to a Cisco Nexus 5010
Target Nexenta system:
Intel barebones, dual Xeon 5620 CPUs, 192GB RAM, Nexenta 3.1.3 Enterprise
Intel X520 dual-port 10Gbit Ethernet, LACP active vPC to the Nexus 5010 switches
2x LSI 9201-16E HBAs, 1x LSI 9200-8e HBA
5 DAEs (3 in use for this test)
1 DAE connected (multipathed) to the LSI 9200-8e, loaded w/ 6x STEC ZeusRAM SSDs striped for ZIL and 6x OCZ Talos C 230GB drives for L2ARC (rough commands below)
2 DAEs connected (multipathed) to one LSI 9201-16E, with 24x 600GB 15k Seagate Cheetah drives
Obviously data integrity is not guaranteed
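For reference, the slog and L2ARC were attached roughly along these lines (a sketch only; 'tank' and the c#t#d# names are placeholders, not the real pool or device paths):

  zpool add tank log c20t0d0 c20t1d0 c20t2d0 c20t3d0 c20t4d0 c20t5d0
      # 6x ZeusRAM as independent (striped) log devices, no mirroring
  zpool add tank cache c21t0d0 c21t1d0 c21t2d0 c21t3d0 c21t4d0 c21t5d0
      # 6x OCZ Talos C 230GB as L2ARC cache devices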
Testing with IOMeter from a Windows guest, 10GB test file, queue depth of 64.
I have a share set up with a 4k recordsize, compression disabled, and access time disabled (the exact property settings are sketched after the numbers below), and am seeing performance as follows:
~50,000 IOPS 4k random read: 200MB/sec, 30% CPU utilization on the Nexenta head, ~90% utilization in the guest OS. I'm guessing the guest OS is the bottleneck; I'm going to try physical hardware next week.
~25,000 IOPS 4k random write: 100MB/sec, ~70% CPU utilization on the Nexenta head, ~45% CPU utilization in the guest OS. It feels like the Nexenta CPU is the bottleneck; load average is 2.5.
A quick test with a 128k recordsize and 128k IOs looked to be around 400MB/sec; I can't remember the CPU utilization on either side. I will retest and report those numbers.
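For completeness, the share properties mentioned above are just set the standard way, roughly like this ('tank/share' is a placeholder dataset name):

  zfs set recordsize=4k tank/share
  zfs set compression=off tank/share
  zfs set atime=off tank/share
  zfs get recordsize,compression,atime tank/share    # confirm what the test actually ran against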
It feels like something is adding more overhead here than I would expect for the 4k recordsize/IO workloads. Any thoughts on where I should start? I'd really like to see closer to 10Gbit performance, but it seems like the hardware isn't able to cope with it.
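If it helps, this is roughly what I'm watching on the Nexenta head while the tests run (pool name is again a placeholder):

  mpstat 1                  # per-CPU utilization -- is a single core pegged?
  zpool iostat -v tank 1    # per-vdev IOPS/bandwidth, including slog and L2ARC activity
  iostat -xnz 1             # per-disk service times and %busy on the back-end drives
  dladm show-link -s -i 1   # link-level traffic on the 10GbE ports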