On 2011-05-22 20:39, Richard Elling wrote:


 This means that the target closed the connection because there was already
 a task in progress. Likely this was the retry after the timeout. By default,
 these timeouts are quite long, so by now performance is already terrible.

 I'm not sure if you can make the iscsi target any slower. I've seen some
 well-known-manufacturer's iscsi arrays take 2-3 minutes to respond, in a
 repeatable manner. The best advice is to fix it or replace it. I don't
 think there is anything you can do on the initiator side to get to a state
 of happiness.
 -- richard


Well, thanks for the advice. As my readers may know ;) I am now
in the process of evacuating data from this lagging "dcpool" into
the physical pool that contains its backing volume and serves iSCSI.

I am surprised that playing with "failmode" achieved nothing -
I had hoped that "dcpool" would continue working from where it
hung at the moment the iSCSI device was lost, as soon as the
device was "found" again... Or do the errors imply that it was
not found after all?
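
For the record, this is roughly what I had been playing with - the
pool name is mine, the values are the ones documented for the zpool
"failmode" property:

    # check the current setting ("wait" is the default, and it blocks
    # all I/O until the device comes back)
    zpool get failmode dcpool

    # "continue" is supposed to return EIO to new writes instead of
    # hanging the pool; "panic" is the third documented option
    zpool set failmode=continue dcpool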

So since I can't seem to "fix" the ZFS/iSCSI server performance,
I'll continue along the path of "replacing" it with native filesystem
datasets in the parent pool.
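
For completeness, the evacuation itself is nothing fancy - roughly
along these lines (the dataset names below are just an illustration
of my layout, not exact commands from my shell history):

    # snapshot a dataset in the lagging pool and replicate it into a
    # plain filesystem dataset in the parent pool
    zfs snapshot -r dcpool/data@evac
    zfs send -R dcpool/data@evac | zfs receive pool/data

    # once everything is verified, the zvol-backed pool (and its
    # container volume) can go away:
    #   zpool destroy dcpool
    #   zfs destroy pool/dcpool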

The experiment taught me a lot, but the practical end result is
not enjoyable for me: such lags and overheads for ZFS volume
storage, and most of all unreliable performance. Next time I'll
think twice about whether iSCSI (at all, and over ZFS volumes in
particular) is really a better solution than large files over NFS
(for VMs or whatever), and/or will plan better and stress-test
first. In particular, that should mean nearly filling up the pool
with several TBs of (any) data... and such stress-testing is
essentially what I have done now, before spec'ing a server for
work and defining what it should and should not do in order to
perform :)
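
By stress-testing I mean nothing smarter than filling most of the
pool with data and re-running the usual benchmarks to see how it
behaves when nearly full - something along these lines (sizes and
paths are only an illustration):

    # write 500 x 10GB files of incompressible data (about 5TB),
    # then watch capacity and throughput degrade over time
    zfs create pool/stress
    i=1
    while [ $i -le 500 ]; do
        dd if=/dev/urandom of=/pool/stress/file$i bs=1024k count=10240
        i=$((i+1))
    done
    zpool list pool     # keep an eye on the CAP column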

Maybe it is just my computer (a dual-core 2.8GHz P4 with 8GB RAM) -
but if such specs repeatably fail at running a single volume, then I'm
at a loss to predict and construct machines for good, stable
ZFS/iSCSI performance ;\

Again, part of the lag is probably due to dedup inside this
"dcpool", and to using 4KB data blocks along with 4KB metadata
blocks for its container volume "pool/dcpool" - which probably
leads to high processing overheads (and certainly did lead to 2x
storage consumption), and maybe to high fragmentation.
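
For the curious, that overhead is easy enough to see from the pool
itself - something like this (the pool and volume names are from my
setup):

    # the DEDUP column shows the pool-wide dedup ratio
    zpool list dcpool

    # the block size I chose for the container volume
    zfs get volblocksize pool/dcpool

    # zdb's dedup-table histogram; the number of DDT entries times a
    # few hundred bytes each gives an idea of the RAM/metadata footprint
    zdb -DD dcpool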

--


+============================================================+
|                                                            |
| Климов Евгений,                                 Jim Klimov |
| технический директор                                   CTO |
| ЗАО "ЦОС и ВТ"                                  JSC COS&HT |
|                                                            |
| +7-903-7705859 (cellular)          mailto:jimkli...@cos.ru |
|                          CC:ad...@cos.ru,jimkli...@mail.ru |
+============================================================+
| ()  ascii ribbon campaign - against html mail              |
| /\                        - against microsoft attachments  |
+============================================================+



