[perf-discuss cc'd]

On Sat, Apr 18, 2009 at 4:27 PM, Gary Mills <mi...@cc.umanitoba.ca> wrote:
> Many other layers are involved in this server.  We use scsi_vhci for
> redundant I/O paths and Sun's Iscsi initiator to connect to the
> storage on our Netapp filer.  The kernel plays a part as well.  How
> do we determine which layer is responsible for the slow performance?

Have you disabled the Nagle algorithm for the iSCSI initiator?

http://bugs.opensolaris.org/view_bug.do?bug_id=6772828

Also, you may want to consider doing backups from the NetApp rather
than from the Solaris box.  Assuming all of your LUNs are in the same
volume on the filer, a snapshot should be a crash-consistent image of
the zpool.  You could verify this by making the snapshot rw and trying
to import the snapshotted LUNs on another host.

Anyway, this would remove the backup-related stress on the T2000.  You
can still do snapshots at the ZFS layer to give you file-level
restores.  If the NetApp caught on fire, you would simply need to
restore the volume containing the LUNs (presumably a small collection
of large files), which would go a lot quicker than restoring a large
collection of small files.

Since iSCSI is in the mix, you should also be sure that your network
is appropriately tuned.  Assuming that you are using the onboard
e1000g NICs, be sure that none of the "bad" counters are incrementing:

$ kstat -p e1000g | nawk '$0 ~ /err|drop|fail|no/ && $NF != 0'

If this gives any output, there is likely something amiss with your
network.

The output from "iostat -xCn 10" could be interesting as well.  If
asvc_t is high (>30 ms?), it means the filer is being slow to respond.
If wsvc_t is frequently non-zero, there is some sort of bottleneck
that prevents the server from sending requests to the filer.  Perhaps
you have tuned ssd_max_throttle, or Solaris has backed off because the
filer said to slow down.  (This assumes that ssd is used with iSCSI
LUNs.)

What else is happening on the filer when mail gets slow?  That is, are
you experiencing slowness due to a mail peak or due to some research
project that happens to be on the same spindles?  What does the
network look like from the NetApp side?

Are the mail server and the NetApp attached to the same switch, or are
they at opposite ends of the campus?  Is there something between them
that is misbehaving?

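If you would rather not eyeball the iostat output, here is a rough
sketch of a filter for the two checks described above.  The function
name flag_iostat and the AWK variable are made up for illustration,
the 30 ms default is only the ballpark figure mentioned earlier, and
the sketch assumes the usual "iostat -xn" header line containing
wsvc_t, asvc_t and device.  Treat it as a starting point, not a tuned
tool.

```shell
# Hypothetical helper: read "iostat -xCn" output on stdin and print the
# devices whose service times look suspicious.
# $1 is the asvc_t limit in milliseconds (defaults to 30, a rough guess).
# Set AWK=nawk on Solaris; plain awk is assumed elsewhere.
flag_iostat() {
    ${AWK:-awk} -v limit="${1:-30}" '
        # Header line: remember which column holds which statistic.
        /wsvc_t/ && /asvc_t/ {
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        # Data line: first field is numeric and a header has been seen.
        ("asvc_t" in col) && ("wsvc_t" in col) && ("device" in col) &&
        NF >= col["device"] && $1 ~ /^[0-9.]+$/ {
            dev = $(col["device"])
            if ($(col["asvc_t"]) + 0 > limit + 0)
                print dev ": asvc_t=" $(col["asvc_t"]) " (filer slow to respond?)"
            if ($(col["wsvc_t"]) + 0 > 0)
                print dev ": wsvc_t=" $(col["wsvc_t"]) " (requests queueing on the host?)"
        }
    '
}
```

Something like "iostat -xCn 10 | AWK=nawk flag_iostat 30" would be the
intended use.  Keep in mind that the first report iostat prints covers
the time since boot, so only the later intervals reflect current
behavior, and that with -C the per-controller summary lines are
checked as well.
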
--
Mike Gerdts
http://mgerdts.blogspot.com/
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss