It looks like I was posting on the wrong mailing list; I thought
this list included developers. The experiment I ran is not for
commercial purposes. The goal of the comparison is to find
optimization opportunities across the entire software stack on
both Linux and Solaris.

As for this ZFS root lock: currently a special re-entrant rwlock
is implemented here for ZFS only. The interesting part is that all
HTTP and HTTPS requests in my benchmark perform read operations
against the Apache server, yet we still see lots of mutex spin
events for this rwlock in the lockstat report. I'll continue to
investigate whether this is a conflict of design philosophy. At the
very least, this lock does not behave well in this kind of
read-mostly, write-rarely case.

If anyone is interested in this topic, I can send updates
offline.

Thanks,
-Aubrey



On Tue, Mar 27, 2012 at 9:42 PM, "Hung-Sheng Tsao (Lao Tsao 老曹) Ph.D."
<laot...@gmail.com> wrote:
> hi
> you did not answer the question: what is the RAM of the server? how many
> sockets and cores, etc.?
> what is the block size of ZFS?
> what is the cache RAM of your SAN array?
> what is the block size/stripe size of the RAID in your SAN array? RAID 5 or
> what?
> what is your test program, and how is it driven (from what kind of client)?
> regards
>
>
>
>
> On 3/26/2012 11:13 PM, Aubrey Li wrote:
>>
>> On Tue, Mar 27, 2012 at 1:15 AM, Jim Klimov<j...@cos.ru>  wrote:
>>>
>>> Well, as a further attempt down this road, is it possible for you to rule
>>> out
>>> ZFS from swapping - i.e., if RAM amounts permit, disable swap entirely
>>> (swap -d /dev/zvol/dsk/rpool/swap) or relocate it to dedicated slices of
>>> the same or, better yet, separate disks?
>>>
>> Thanks Jim for your suggestion!
>>
>>
>>> If you do have lots of swapping activity (that can be seen in "vmstat 1"
>>> as
>>> si/so columns) going on in a zvol, you're likely to get much
>>> fragmentation
>>> in the pool, and searching for contiguous stretches of space can become
>>> tricky (and time-consuming), or larger writes can get broken down into
>>> many smaller random writes and/or "gang blocks", which is also slower.
>>> At least such waiting on disks can explain the overall large kernel
>>> times.
>>
>> I took swapping activity into account, even when the CPU% is 100%, "si"
>> (swap-ins) and "so" (swap-outs) are always ZEROs.
>>
>>> You can also see the disk wait times ratio in "iostat -xzn 1" column "%w"
>>> and disk busy times ratio in "%b" (second and third from the right).
>>> I don't remember you posting that.
>>>
>>> If these are accounting in tens, or even close or equal to 100%, then
>>> your disks are the actual bottleneck. Speeding up that subsystem,
>>> including addition of cache (ARC RAM, L2ARC SSD, maybe ZIL
>>> SSD/DDRDrive) and combatting fragmentation by moving swap and
>>> other scratch spaces to dedicated pools or raw slices might help.
>>
>> My storage system is not quite busy, and there are only read operations.
>> =====================================
>> # iostat -xnz 3
>>                     extended device statistics
>>     r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
>>   112.4    0.0 1691.5    0.0  0.0  0.5    0.0    4.8   0  41 c11t0d0
>>                     extended device statistics
>>     r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
>>   118.7    0.0 1867.0    0.0  0.0  0.5    0.0    4.5   0  42 c11t0d0
>>                     extended device statistics
>>     r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
>>   127.7    0.0 2121.6    0.0  0.0  0.6    0.0    4.7   0  44 c11t0d0
>>                     extended device statistics
>>     r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
>>   141.3    0.0 2158.5    0.0  0.0  0.7    0.0    4.6   0  48 c11t0d0
>> ==============================================
>>
>> Thanks,
>> -Aubrey
>> _______________________________________________
>> zfs-discuss mailing list
>> zfs-discuss@opensolaris.org
>> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>
>
> --
> Hung-Sheng Tsao Ph D.
> Founder & Principal
> HopBit GridComputing LLC
> cell: 9734950840
>
> http://laotsao.blogspot.com/
> http://laotsao.wordpress.com/
> http://blogs.oracle.com/hstsao/
>
