2011-05-22 20:39, Richard Elling wrote:
This means that the target closed the connection because there was already a task in progress. Likely this was the retry after the timeout. By default, these timeouts are quite long, so by now performance is already terrible. I'm not sure if you can make the iSCSI target any slower. I've seen a well-known manufacturer's iSCSI arrays take 2-3 minutes to respond, in a repeatable manner. The best advice is to fix it or replace it. I don't think there is anything you can do on the initiator side to get to a state of happiness. -- richard
Well, thanks for the advice. As my readers may know ;) I am now in the process of evacuating data from this lagging "dcpool" into the physical pool which contains the volume served over iSCSI. I am surprised that playing with "failmode" achieved nothing - I had hoped that "dcpool" would continue working from where it hung at the moment the iSCSI device was lost, as soon as the device was "found" again... Or do the errors imply that it was never found after all?

So, since I can't seem to "fix" the ZFS/iSCSI server performance, I'll continue along the path of "replacing" it with native filesystem datasets in the parent pool. The experiment taught me a lot, but the practical end result is not enjoyable: such lags and overheads for ZFS volume storage, and most of all unreliable performance.

Next time I'll think twice about whether iSCSI (at all, and on top of a ZFS volume in particular) is really a better solution than large files over NFS (for VMs or whatever), and/or I will plan better and stress-test first. In particular, that should mean nearly filling up the pool with several TBs of (any) data... and such a stress test is essentially what I have ended up doing now, before spec'ing a server for work and defining what it should and should not do in order to perform :)

Maybe it is just my computer (dual-core 2.8GHz P4 with 8GB of RAM) - but if such specs repeatably fail at serving a single volume, then I'm at a loss as to how to predict and build machines for good, stable ZFS/iSCSI performance ;\

Again, part of the lag probably comes from dedup inside this "dcpool", and from using 4KB data blocks along with 4KB metadata blocks for its container volume "pool/dcpool" - which probably leads to high processing overhead (and certainly did lead to 2x storage consumption), and maybe to heavy fragmentation. (A couple of generic command sketches related to these points are appended at the end of this message.)

--
Jim Klimov (Климов Евгений), CTO, JSC "COS&HT" (ЗАО "ЦОС и ВТ")
+7-903-7705859 (cellular)    mailto:jimkli...@cos.ru
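
For reference, a minimal sketch of the generic commands for checking the settings discussed above, using the pool and volume names from this thread (a sketch only, not a record of what was actually run):

    # How the inner pool reacts to losing its (iSCSI) backing device,
    # and how much dedup is actually saving:
    zpool get failmode,dedupratio dcpool

    # failmode=wait (the default) blocks I/O until the device comes back
    # and the errors are cleared; failmode=continue returns EIO to new
    # writes instead of hanging:
    zpool set failmode=continue dcpool

    # Block size and space accounting of the backing volume in the parent pool:
    zfs get volblocksize,used,referenced,compressratio pool/dcpool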
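
One generic way to do such an evacuation is ZFS replication with send/receive (again just a sketch, not necessarily how it is being done here); "dcpool/data" and "pool/data" are hypothetical dataset names:

    # Snapshot the dataset tree inside the lagging pool...
    zfs snapshot -r dcpool/data@evacuate

    # ...and replicate it, with descendants and properties, into the
    # parent pool, leaving the received datasets unmounted for now:
    zfs send -R dcpool/data@evacuate | zfs receive -u pool/data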