I asked Tano to use the 'snoop' command to capture the Ethernet packets to a file, while he attempted VMware's 'VMotion'.
# snoop -d {device} -o {filename} tcp port 3260

This file was made available to me on Tano's web server. The file size was nearly 85 Mbytes, capturing over 100,000 packets. I have downloaded the capture file and have been looking at it with Ethereal and Wireshark.

I do not have a corresponding 'iscsisnoop.d' file, but from the pattern of activity that I see, I can well imagine that it would show the same pattern that we saw from Eugene, which I reported on here:
http://mail.opensolaris.org/pipermail/storage-discuss/2008-October/006444.html

(So here I'm looking at what's happening at the lower TCP level, rather than at the iSCSI level.)

In the Ethernet capture file, I can see the pattern of bursts of writes from the initiator. The target can accept only so many of these, and then needs to slow things down by reducing the TCP window size. Eventually the target says the TCP window size is zero, effectively asking the initiator to stop.

Now to start with, the target only leaves the 'TCP ZeroWindow' in place for a fraction of a second. Then it opens things up again by sending a 'TCP Window Update', restoring the window to 65160 bytes, and the transfer resumes. This is normal and expected.

But eventually we get to a stage where the target sets the 'TCP ZeroWindow' and leaves it there for an extended period of time. I am talking about seconds here. The initiator starts to send 'TCP ZeroWindowProbe' packets every 5 seconds. The target promptly responds with a 'TCP ZeroWindowProbeAck' packet. (Presumably, this is the initiator just confirming that the target is still alive.) This cycle of probes and acks repeats for 50 seconds. During this period the target shows no sign of wanting to accept any more data.

Then the initiator seems to decide it has had enough and cannot be bothered to wait any longer. It [RST,ACK]'s the TCP session, and then starts a fresh iSCSI login. (And then we go around the whole cycle again.)
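To pick the long stalls out from the short, harmless ones, the pattern above could be checked mechanically. The following is only a rough sketch in Python, under some assumptions: the event labels are hypothetical names of my own, the packets are assumed to have already been reduced to (time, event) pairs for one TCP connection, and the sample timeline is made up to resemble the pattern described, not taken from Tano's capture.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

# Hypothetical event labels; a real capture would first have to be
# mapped onto these (for example from the analyser's packet summaries).
ZERO_WINDOW = "zero_window"
WINDOW_UPDATE = "window_update"
PROBE = "zero_window_probe"
PROBE_ACK = "zero_window_probe_ack"
RESET = "rst"


@dataclass
class Stall:
    start: float    # time the target advertised a zero window
    end: float      # time the stall ended
    probes: int     # zero-window probes sent by the initiator meanwhile
    ended_by: str   # WINDOW_UPDATE (recovered) or RESET (session torn down)

    @property
    def duration(self) -> float:
        return self.end - self.start


def find_stalls(events: Iterable[Tuple[float, str]]) -> List[Stall]:
    """Return every zero-window period in a time-ordered event stream."""
    stalls: List[Stall] = []
    start: Optional[float] = None
    probes = 0
    for when, kind in events:
        if start is None:
            # Not stalled: only a zero-window advertisement opens a stall.
            if kind == ZERO_WINDOW:
                start, probes = when, 0
        elif kind == PROBE:
            probes += 1
        elif kind in (WINDOW_UPDATE, RESET):
            stalls.append(Stall(start, when, probes, kind))
            start = None
        # Probe acks and repeated zero-window packets leave the state as is.
    return stalls


# Made-up timeline: one brief stall that recovers, then one long stall
# with a probe every 5 seconds that ends in a reset.
sample = [(0.0, ZERO_WINDOW), (0.25, WINDOW_UPDATE), (10.0, ZERO_WINDOW)]
for i in range(1, 11):
    sample.append((10.0 + 5 * i, PROBE))
    sample.append((10.0 + 5 * i + 0.001, PROBE_ACK))
sample.append((61.0, RESET))

for stall in find_stalls(sample):
    if stall.duration > 1.0:  # arbitrary threshold for "extended"
        print(f"stall of {stall.duration:.1f}s, {stall.probes} probes, "
              f"ended by {stall.ended_by}")
```

The one-second threshold is arbitrary. The point is only to separate the sub-second stalls, which are normal flow control, from the ones that last long enough for the initiator to give up.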
The question is: why has the target refused to accept any more data for over 50 seconds?

The obvious conclusion would be that the OpenSolaris box is so busy that it does not have any time left to empty the network stack buffers. But this then just leads you to another question: why?

So the mystery deepens, and I am running out of ideas!

Tano, maybe you could check the network performance with the 'iperf' programs, as mentioned here:
http://mail.opensolaris.org/pipermail/zfs-discuss/2008-October/052136.html

Does the OpenSolaris box give any indication of being busy with other things? Try running 'prstat' to see if it gives any clues.

Presumably you are using ZFS as the backing store for iSCSI, in which case, maybe try with a UFS formatted disk to see if that is a factor.

Regards
Nigel Smith