My comments on the iostat measurements are below...
Gary Mills wrote:
On Sat, Apr 18, 2009 at 11:45:54PM -0500, Mike Gerdts wrote:
[perf-discuss cc'd]
On Sat, Apr 18, 2009 at 4:27 PM, Gary Mills <mi...@cc.umanitoba.ca> wrote:
Many other layers are involved in this server. We use scsi_vhci for
redundant I/O paths and Sun's iSCSI initiator to connect to the
storage on our NetApp filer. The kernel plays a part as well. How
do we determine which layer is responsible for the slow performance?
Have you disabled the nagle algorithm for the iscsi initiator?
http://bugs.opensolaris.org/view_bug.do?bug_id=6772828
I tried that on our test IMAP backend the other day. It made no
significant difference to read or write times or to ZFS I/O bandwidth.
I conclude that the iSCSI initiator has already sized its TCP packets
to avoid Nagle delays.
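For anyone who wants to double-check that, one rough way to see whether Nagle is even in play is the global TCP tunable tcp_naglim_def. This is a system-wide setting rather than the initiator-specific fix in the bug above, so treat it as a diagnostic sketch only: setting it to 1 disables Nagle for every TCP connection on the box, and it reverts at reboot.
$ ndd /dev/tcp tcp_naglim_def           # 4095 is the default; Nagle is active
# ndd -set /dev/tcp tcp_naglim_def 1    # disable Nagle globally (not persistent)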
Also, you may want to consider doing backups from the NetApp rather
than from the Solaris box.
I've certainly recommended finding a different way to perform backups.
Assuming all of your LUNs are in the same
volume on the filer, a snapshot should be a crash-consistent image of
the zpool. You could verify this by making the snapshot rw and trying
to import the snapshotted LUNs on another host.
That part sounds scary! The filer exports four LUNs that are combined
into one ZFS pool on the IMAP server. These LUNs are volumes on the
filer. How can we safely import them on another host?
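As a rough sketch of what Mike is describing, purely for illustration: if the filer presents writable clones of all four LUNs to a separate test host, an import done there never touches the production LUNs. The pool name below is a placeholder; -f is needed because the pool was never exported from the test host's point of view, and -R keeps the file systems under an alternate root.
$ zpool import                      # lists pools visible on the cloned LUNs
# zpool import -f -R /mnt/verify pool_name
# zpool status pool_name
# zpool export pool_name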
Anyway, this would
remove the backup-related stress on the T2000. You can still do
snapshots at the ZFS layer to give you file level restores. If the
NetApp caught on fire, you would simply need to restore the volume
containing the LUNs (presumably a small collection of large files)
which would go a lot quicker than a large collection of small files.
Yes, a disaster recovery would be much quicker in this case.
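For the file-level restore side, a minimal sketch (the pool, file system, and path names here are made up): snapshots taken at the ZFS layer are browsable read-only under .zfs/snapshot, so a single mailbox can be copied back without involving the filer at all.
# zfs snapshot -r pool_name@nightly
$ ls /pool_name/mail/.zfs/snapshot/nightly/
$ cp /pool_name/mail/.zfs/snapshot/nightly/users/jdoe/INBOX /pool_name/mail/users/jdoe/INBOX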
Since iSCSI is in the mix, you should also be sure that your network
is appropriately tuned. Assuming that you are using the onboard
e1000g NICs, be sure that none of the "bad" counters are incrementing:
$ kstat -p e1000g | nawk '$0 ~ /err|drop|fail|no/ && $NF != 0'
If this gives any output, there is likely something amiss with your network.
Only this:
e1000g:0:e1000g0:unknowns 1764449
I don't know what those are, but it's e1000g1 and e1000g2 that are
used for the iSCSI network.
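If memory serves, "unknowns" just counts received frames for a protocol nothing on the box was listening for (the ifInUnknownProtos counter), which is usually harmless. It may still be worth pointing the same filter at the two iSCSI-facing instances specifically, something like:
$ kstat -p -m e1000g -i 1 | nawk '$0 ~ /err|drop|fail|no/ && $NF != 0'
$ kstat -p -m e1000g -i 2 | nawk '$0 ~ /err|drop|fail|no/ && $NF != 0'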
The output from "iostat -xCn 10" could be interesting as well. If
asvc_t is high (>30?), it means the filer is being slow to respond.
If wsvc_t is frequently non-zero, there is some sort of a bottleneck
that prevents the server from sending requests to the filer. Perhaps
you have tuned ssd_max_throttle or Solaris has backed off because the
filer said to slow down. (Assuming that ssd is used with iSCSI LUNs).
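A quick way to see whether a throttle has been set, assuming the LUNs really do attach through ssd rather than sd (if the symbol isn't there, mdb will simply complain):
# echo "ssd_max_throttle/D" | mdb -k
$ grep -i throttle /etc/system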
Here's an example, taken from one of the busy periods:
                     extended device statistics
    r/s    w/s    kr/s   kw/s  wait  actv  wsvc_t  asvc_t  %w  %b  device
    0.0    5.0     0.0    7.7   0.0   0.1     4.1    24.8   1   1  c1t2d0
   27.0   13.8  1523.4  172.9   0.0   0.5     0.0    11.8   0  38  c4t60A98000433469764E4A2D456A644A74d0
   42.0   21.4  2027.3  350.0   0.0   0.9     0.0    13.9   0  60  c4t60A98000433469764E4A2D456A696579d0
   40.8   25.0  1993.5  339.1   0.0   0.8     0.0    11.8   0  52  c4t60A98000433469764E4A476D2F664E4Fd0
   42.0   26.6  1968.4  319.1   0.0   0.8     0.0    11.8   0  56  c4t60A98000433469764E4A476D2F6B385Ad0
I see no evidence of an I/O or file system bottleneck here. While the
service times are a little higher than I would expect, I don't get worried
until %b is high, actv is high, and asvc_t is high(er), all at once. I think
your problem is elsewhere.
NB: when looking at ZFS, a 1-second interval for iostat is too short
to be useful. 10 seconds is generally better, especially for older
releases of ZFS (anything on Solaris 10).
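It can also help to watch the same interval from ZFS's side, since device-level iostat can't attribute the traffic per vdev. Substituting the actual pool name:
$ zpool iostat -v poolname 10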
<shameless plug>
ZFS consulting available at http://www.richardelling.com
</shameless plug>
-- richard
_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org