Hey, Jim! Thanks so much for the excellent assist on this - a much better
answer than I could ever have given!

I thought I'd add a little bit on the other four...

 - raising ddi_msix_alloc_limit to 8

This is for PCI cards that can use up to eight MSI-X interrupts, which our
10GbE adapters can. The previous value of 2 could cause CPU interrupt
bottlenecks. So far this has been more of a preventative measure - we
haven't seen a case where it made a real performance impact.
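
In case it's useful to anyone, this is a kernel tunable that lives in
/etc/system, so the setting looks something like the line below, plus a
reboot (from memory - double-check the syntax against your release's
tunable parameters docs):

  * allow drivers to allocate up to 8 MSI-X interrupts per device
  set ddi_msix_alloc_limit=8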

 - raising ip_soft_rings_cnt to 16

This increases the number of kernel threads dedicated to inbound packet
processing and is specifically meant to reduce latency when handling 10GbE
traffic. It showed a small performance improvement.
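
From memory, this is another /etc/system-style tunable (an ip module
variable), so something like the following plus a reboot - treat it as a
sketch and check your release's docs for the exact form:

  * fan inbound 10GbE packets out across 16 soft-ring worker threads
  set ip:ip_soft_rings_cnt=16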

 - raising tcp_deferred_acks_max to 16

This reduces the number of ACK packets sent, thus reducing the overall TCP
overhead. This showed a small performance improvement.
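
For anyone who wants to experiment, this one is an ndd tunable on
/dev/tcp, so roughly (verify the current value before changing it):

  # show the current setting, then defer ACKs for up to 16 received segments
  ndd -get /dev/tcp tcp_deferred_acks_max
  ndd -set /dev/tcp tcp_deferred_acks_max 16

ndd changes don't survive a reboot, so you'd also want them in a
boot-time script.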

 - raising tcp_local_dacks_max to 16

This is the analogous knob for directly connected peers, I believe - it
also defers ACK packets and showed a tiny performance improvement.
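
It's set the same way as the previous one, e.g.:

  # defer up to 16 ACKs for directly connected peers as well
  ndd -set /dev/tcp tcp_local_dacks_max 16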

Overall, we've found that these four settings don't make a whole lot of
difference, but every little bit helps. ;> The four that Jim went through
were much more impactful, particularly enabling jumbo frames and disabling
the Nagle algorithm - rough commands for those two are below.
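
For completeness, here's one way to do those two on an OpenSolaris box -
a sketch only, not necessarily exactly what we ran, and the link name
(ixgbe0) is just a stand-in for your own 10GbE interface:

  # jumbo frames, on a build with dladm link properties
  # (older releases need driver.conf changes instead)
  dladm set-linkprop -p mtu=9000 ixgbe0

  # effectively disable Nagle system-wide by sending every write immediately
  ndd -set /dev/tcp tcp_naglim_def 1

Applications can also get the per-socket equivalent of the second one via
the TCP_NODELAY socket option.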

-Gray

On Tue, Oct 21, 2008 at 4:21 AM, Jim Dunham <[EMAIL PROTECTED]> wrote:

> Gary,
>
>   Sidenote: Today we made eight network/iSCSI related tweaks that, in
>>>  aggregate, have resulted in dramatic performance improvements (some I
>>>  just hadn't gotten around to yet, others suggested by Sun's Mertol
>>>  Ozyoney)...
>>>  - disabling the Nagle algorithm on the head node
>>>  - setting each iSCSI target block size to match the ZFS record size of
>>>  128K
>>>  - disabling "thin provisioning" on the iSCSI targets
>>>  - enabling jumbo frames everywhere (each switch and NIC)
>>>  - raising ddi_msix_alloc_limit to 8
>>>  - raising ip_soft_rings_cnt to 16
>>>  - raising tcp_deferred_acks_max to 16
>>>  - raising tcp_local_dacks_max to 16
>>>
>>
>> Can you tell us which of those changes made the most dramatic
>> improvement?
>>
>
>   - disabling the Nagle algorithm on the head node
>>>
>>
> This will have a dramatic effect on most I/Os, except for large
> sequential writes.
>
>  - setting each iSCSI target block size to match the ZFS record size of
>>> 128K
>>>  - enabling jumbo frames everywhere (each switch and NIC)
>>>
>>
>
> These will have a positive effect on large writes, both sequential and
> random.
>
>   - disabling "thin provisioning" on the iSCSI targets
>>>
>>
> This only has a benefit for file-based or dsk-based backing stores. If one
> uses rdsk backing stores of any type, this is not an issue.
>
> Jim
>
>  I have a similar situation here, with a 2-TB ZFS pool on
>> a T2000 using iSCSI to a NetApp file server.  Is there any way to tell
>> in advance if any of those changes will make a difference?  Many of
>> them seem to be server resources.  How can I determine their current
>> usage?
>>
>> --
>> -Gary Mills-    -Unix Support-    -U of M Academic Computing and
>> Networking-
>>
>
> Jim Dunham
> Storage Platform Software Group
> Sun Microsystems, Inc.
>



-- 
Gray Carper
MSIS Technical Services
University of Michigan Medical School
[EMAIL PROTECTED]  |  skype:  graycarper  |  734.418.8506
http://www.umms.med.umich.edu/msis/
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
