Hey, Jim! Thanks so much for the excellent assist on this - much better than I could have ever answered it!
I thought I'd add a little bit on the other four...

- Raising ddi_msix_alloc_limit to 8: for PCI cards that use up to 8
interrupts, which our 10GbE adapters do. The previous value of 2 could
cause CPU interrupt bottlenecks. So far this has been more of a
preventative measure - we haven't seen a case where it made a real
performance impact.

- Raising ip_soft_rings_cnt to 16: this increases the number of kernel
threads associated with packet processing and is specifically meant to
reduce latency when handling 10GbE traffic. This showed a small
performance improvement.

- Raising tcp_deferred_acks_max to 16: this reduces the number of ACK
packets sent, reducing overall TCP overhead. This showed a small
performance improvement.

- Raising tcp_local_dacks_max to 16: this also slows down ACK packets
and showed a tiny performance improvement.

Overall, we have found these four settings to not make a whole lot of
difference, but every little bit helps. ;> The four that Jim went
through were much more impactful, particularly enabling jumbo frames
and disabling the Nagle algorithm. (For anyone who wants to try any of
this, I've tacked a rough sketch of where each knob lives at the bottom
of this message.)

-Gray

On Tue, Oct 21, 2008 at 4:21 AM, Jim Dunham <[EMAIL PROTECTED]> wrote:
> Gary,
>
>>> Sidenote: Today we made eight network/iSCSI related tweaks that, in
>>> aggregate, have resulted in dramatic performance improvements (some I
>>> just hadn't gotten around to yet, others suggested by Sun's Mertol
>>> Ozyoney)...
>>> - disabling the Nagle algorithm on the head node
>>> - setting each iSCSI target block size to match the ZFS record size of
>>> 128K
>>> - disabling "thin provisioning" on the iSCSI targets
>>> - enabling jumbo frames everywhere (each switch and NIC)
>>> - raising ddi_msix_alloc_limit to 8
>>> - raising ip_soft_rings_cnt to 16
>>> - raising tcp_deferred_acks_max to 16
>>> - raising tcp_local_dacks_max to 16
>>
>> Can you tell us which of those changes made the most dramatic
>> improvement?
>
>>> - disabling the Nagle algorithm on the head node
>
> This will have a dramatic effect on most I/Os, except for large
> sequential writes.
>
>>> - setting each iSCSI target block size to match the ZFS record size of
>>> 128K
>>> - enabling jumbo frames everywhere (each switch and NIC)
>
> These will have a positive effect on large writes, both sequential and
> random.
>
>>> - disabling "thin provisioning" on the iSCSI targets
>
> This only has a benefit for file-based or dsk-based backing stores. If
> one uses rdsk backing stores of any type, this is not an issue.
>
> Jim
>
>> I have a similar situation here, with a 2-TB ZFS pool on
>> a T2000 using iSCSI to a Netapp file server. Is there any way to tell
>> in advance if any of those changes will make a difference? Many of
>> them seem to be server resources. How can I determine their current
>> usage?
>>
>> --
>> -Gary Mills-  -Unix Support-  -U of M Academic Computing and Networking-
>> _______________________________________________
>> zfs-discuss mailing list
>> zfs-discuss@opensolaris.org
>> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>
> Jim Dunham
> Storage Platform Software Group
> Sun Microsystems, Inc.

--
Gray Carper
MSIS Technical Services
University of Michigan Medical School
[EMAIL PROTECTED] | skype: graycarper | 734.418.8506
http://www.umms.med.umich.edu/msis/
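P.S. For anyone who wants to poke at these knobs themselves, here's a
rough sketch of where each one lives on Solaris 10 / OpenSolaris. I'm
writing this from the standard tuning references rather than pasting
our exact config, so treat it as a starting point and double-check the
names and defaults against your release:

  # /etc/system entries (take effect after a reboot)
  set ddi_msix_alloc_limit=8
  set ip:ip_soft_rings_cnt=16

  # Runtime TCP tunables (not persistent - rerun from a boot script)
  ndd -set /dev/tcp tcp_deferred_acks_max 16
  ndd -set /dev/tcp tcp_local_dacks_max 16

  # Nagle: setting tcp_naglim_def to 1 effectively disables it
  ndd -set /dev/tcp tcp_naglim_def 1

  # Checking current values (answers Gary's "current usage" question)
  ndd -get /dev/tcp tcp_deferred_acks_max
  echo 'ddi_msix_alloc_limit/D' | mdb -k

Jumbo frames are NIC-driver-specific (a driver.conf setting plus
"ifconfig <interface> mtu 9000", and the matching MTU on every switch
port in the path), so I won't try to give a one-liner for those.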