You're running into the same problem I had with 2009.06: they have
"corrected" a bug where, prior to 2009.06, the iSCSI target didn't
fully honor SCSI sync commands issued by the initiator.
Some background:
Discussion:
http://opensolaris.org/jive/thread.jspa?messageID=388492
"corrected bug"
http://bugs.opensolaris.org/view_bug.do?bug_id=6770534
The upshot is that on 2009.06, unless you have an SSD (or other
high-speed dedicated device) attached as a ZIL (slog), you won't see
anywhere near the local-speed performance the storage is capable of,
because each transaction is forced all the way down to disk and back
up before the next SCSI block command can proceed.
This iSCSI performance profile is currently specific to 2009.06 and
does not occur on 2008.11. As a stopgap (since I don't have a budget
for SSDs right now) I'm keeping my production servers on 2008.11
(taking into account the additional potential risk, but these are
machines with battery backed SAS cards in a conditioned data center).
These machines are serving up iSCSI to ESX 3.5 and ESX 4 servers.
For my freewheeling home use, where everything gets tried, crashed,
patched and put back together with baling twine (and is backed up
elsewhere...), I've created a 1 GB RAM disk and attached it to the
pool as a ZIL. Performance then runs in cycles: the ZIL loads up to
saturation, flushes out to disk, and keeps going. I also wrote a
script to regularly dd the RAM disk device out to a file so that I can
recreate it with the appropriate signature if I have to reboot the
osol box. This setup is used with the globalSAN initiator on OS X as
well as various Windows and Linux machines, physical and VM.
Assuming this is a test system that you're playing with, that you can
destroy the pool with impunity, and that you don't have an SSD lying
around to test with, try the following:
ramdiskadm -a slog 2g (or whatever size you can manage reasonably with
the available physical RAM - try "vmstat 1 2" to determine available
memory)
zpool add <poolname> log /dev/ramdisk/slog
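A quick sanity check that the log device attached (poolname as above):
zpool status <poolname>
The RAM disk should show up under a separate "logs" section of the
output.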
If you want to reuse the slog later (RAM disks are not preserved over
reboot), write the slog volume out to disk and dump it back in after
restarting:
dd if=/dev/ramdisk/slog of=/root/slog.dd
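and after a reboot something along these lines should put it back (a
sketch, assuming the same 2 GB size and the /root/slog.dd image from
above):
ramdiskadm -a slog 2g
dd if=/root/slog.dd of=/dev/ramdisk/slog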
All of the above assumes that you are not doing this stuff against
rpool. I think that attaching a volatile log device to your boot pool
would result in a machine that can't mount the root zfs volume.
It's easiest to monitor from the Mac (I find), so try your test again
with Activity Monitor showing network traffic: you'll see it hit a
wire-speed ceiling while the ZIL is filling, then once the ZIL
saturates your traffic will drop to near nothing, and then pick up
again after a few seconds. If you don't saturate the ZIL you'll see
continuous full-speed data transfer.
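You can watch the same cycle from the osol side as well; something
like this shows per-device activity, including the log device, once a
second (poolname as above):
zpool iostat -v <poolname> 1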
Cheers,
Erik
On 4 Aug 09, at 15:57, Charles Baker wrote:
My testing has shown some serious problems with the
iSCSI implementation for OpenSolaris.
I set up a VMware vSphere 4 box with RAID 10
direct-attached storage and 3 virtual machines:
- OpenSolaris 2009.06 (snv_111b) running 64-bit
- CentOS 5.3 x64 (ran yum update)
- Ubuntu Server 9.04 x64 (ran apt-get upgrade)
I gave each virtual machine 2 GB of RAM and a 32 GB
drive, and set up a 16 GB iSCSI target on each (the
two Linux VMs used iSCSI Enterprise Target 0.4.16
with blockio).
VMware Tools was installed on each. No tuning was
done on any of the operating systems.
I ran two tests for write performance - one on the
server itself and one from my Mac connected via a
Gigabit (MTU 1500) iSCSI connection using
globalSAN’s latest initiator.
Here’s what I used on the servers:
time dd if=/dev/zero of=/root/testfile bs=1048576k
count=4
and the Mac OS with the iSCSI connected drive
(formatted with GPT / Mac OS Extended journaled):
time dd if=/dev/zero of=/Volumes/test/testfile
bs=1048576k count=4
The results were very interesting (all calculations
use 1 MB = 1,048,576 bytes).
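(For reference: bs=1048576k count=4 writes 4 GiB, i.e. 4096 MB, so
each figure below is just 4096 MB divided by the elapsed time dd
reports - for example, a hypothetical run of about 48 seconds works
out to 4096 MB / 48 s ≈ 85 MB/s.)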
For OpenSolaris, the local write performance averaged
86 MB/s. I turned on lzjb compression for rpool (zfs
set compression=lzjb rpool) and it went up to 414
MB/s (since I’m writing zeros). The average
performance via iSCSI was an abysmal 16 MB/s (even
with compression turned on - with it off, 13 MB/s).
For CentOS (ext3), local write performance averaged
141 MB/s. iSCSI performance was 78 MB/s (almost as
fast as local ZFS performance on the OpenSolaris
server when compression was turned off).
Ubuntu Server (ext4) had 150 MB/s for the local
write. iSCSI performance averaged 80 MB/s.
One of the main differences between the three virtual
machines was that the iSCSI target on the Linux
machines used partitions with no file system. On
OpenSolaris, the iSCSI target you create sits on top of
ZFS. That creates a lot of overhead (although you do
get some great features).
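For what it's worth, the ZFS-backed target on the OpenSolaris side is
typically a zvol shared out over iSCSI; a minimal sketch (pool and
volume names are placeholders, and this assumes the legacy iscsitgt
target via the shareiscsi property rather than COMSTAR):
zfs create -V 16g <poolname>/iscsivol
zfs set shareiscsi=on <poolname>/iscsivol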
Since all the virtual machines were connected to the
same switch (with the same MTU), had the same amount
of RAM, used default configurations for the operating
systems, and sat on the same RAID 10 storage, I’d say
it was a pretty level playing field.
While jumbo frames will help iSCSI performance, they
won’t overcome inherent limitations of the iSCSI
target’s implementation.
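For completeness, enabling jumbo frames on the OpenSolaris side would
look something like the line below; the link name is a guess, the
driver has to support it, and the switch and every NIC in the path
need to pass 9000-byte frames.
dladm set-linkprop -p mtu=9000 e1000g0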
Cross-posting with zfs-discuss.
--
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss