Re: [zfs-discuss] zfs recordsize change improves performance

Asif Iqbal Thu, 20 May 2010 21:18:52 -0700

On Thu, May 20, 2010 at 10:53 PM, Richard Elling
<richard.ell...@gmail.com> wrote:
> On May 20, 2010, at 7:09 PM, Asif Iqbal wrote:
>
>> On Thu, May 20, 2010 at 8:34 PM, Richard Elling
>> <richard.ell...@gmail.com> wrote:
>>> On May 20, 2010, at 11:07 AM, Asif Iqbal wrote:
>>>
>>>> On Thu, May 20, 2010 at 1:51 PM, Asif Iqbal <vad...@gmail.com> wrote:
>>>>> I have a T2000 with a dual port 4gb hba (QLE2462) and a 3510FC with
>>>>> one controller 2gb/s attached to it.
>>>>> I am running sol 10 u3 .
>
> I seem to have missed this on the first read-through.  Solaris 10u3?  Are
> you serious?  That was released about three and a half years ago.  Has it
> been patched at all?  If not, then I think you shouldn't expect the sort of
> performance you can get with a modern release.


We just patched the failover server/storage. This is next.

>
>>>>> every time I change the recordsize of the zfs fs the disk IO improves
>>>>> (doubles) and stays like that for
>>>>> about 5 to 6 hrs. Then it dies down. I increase the recordsize again
>>>>> and performance jumps back to
>>>>> double again. The main app is an oracle database with 8K blocksize
>>>>>
>>>>> I changed the zfs recordsize from 8K to 16K and then to 32K, every 8
>>>>> hrs, which improved the disk IO
>
> Yes.  If the recordsize is greater than the database block size, then
> you will be doing more read/modify/write cycles which will increase
> disk I/O rates, but decrease overall performance and efficiency.
>
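To put rough numbers on the read/modify/write point above, here is a minimal Python sketch. It assumes the record being updated is not already cached, and it ignores raidz parity, metadata and compression. The function name rmw_cost is made up for illustration and is not part of any ZFS tool.

```python
def rmw_cost(db_block_kib, recordsize_kib):
    """Estimate device I/O for overwriting one database block in place.

    Assumption: the target record is not cached, so a partial-record
    write must first read the whole record from disk.
    """
    if recordsize_kib <= db_block_kib:
        # The write covers whole records; nothing needs to be read first.
        read_kib, write_kib = 0, db_block_kib
    else:
        # Partial-record write: read the record, patch it, write it back.
        read_kib, write_kib = recordsize_kib, recordsize_kib
    return {
        "read_kib": read_kib,
        "write_kib": write_kib,
        # Total KiB moved per KiB the database actually changed.
        "amplification": (read_kib + write_kib) / db_block_kib,
    }


if __name__ == "__main__":
    # 8K Oracle block against a few candidate recordsize values.
    for recordsize in (8, 16, 32, 128):
        print(recordsize, rmw_cost(8, recordsize))
```

Under these assumptions, an 8K Oracle block on a 32K recordsize moves 64K through the disks for every 8K the database changes, so a higher disk I/O rate need not mean a faster application.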

well, the application becomes happier with every upward change. that made me
think the zfs cache is getting flushed by this change.

>>>>>
>>>>> I wonder if there is any other zfs parameter that I can change to keep
>>>>> the performance good, since I
>>>>> am running older sol 10.
>>>>>
>>>>> I have single disk luns on the 3510 with mpxio enabled on T2000. each
>>>>> disk has two paths (primary,primary)
>>>>> online per luxadm.
>>>>>
>>>>> zpool iostat 10 gives me only about 6MB/s max write bandwidth. I was
>>>>> hoping for it to be a lot higher.
>>>>>
>>>>> the battery on 3510 is expired and waiting for a replacement.
>>>>>
>>>>> besides replacing the battery, what else can I do to improve the write
>>>>> bandwidth?
>>>>>
>>>>> does the expired battery directly affect oracle's disk IO? I
>>>>> thought oracle would just write to zfs and be done,
>>>>> and zpool would then write through to the controller instead of
>>>>> write-back since there is no battery.
>>>>>
>>>>> sun storage guys found no other issue besides the battery.
>>>>>
>>>>> should disabling zil improve performance? I won't try it until we get
>>>>> the battery so as not to risk data loss
>>>>> during an outage.
>
> If you disable the ZIL for locally run Oracle and you have an unscheduled
> outage, then it is highly probable that you will lose data.

yep. that is why I am not doing it until we replace the battery

>
>>>>
>>>> so my 3510 is essentially behaving like a 3510 jbod but why would that
>>>> make the IO bandwidth this low?
>>>
>>> The application is not driving enough load to make the bandwidth be
>>> higher.  Why?  Because it is an Oracle database and will be making
>>> sync writes, by default. Since you do not have a working battery, those
>>> writes are taking 10-40ms each.  Replace your battery.
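The latency argument above can be turned into a back-of-the-envelope calculation. This is only a sketch: the 8K write size comes from the Oracle block size mentioned in the thread and the 10-40ms figures from the paragraph above, while the number of outstanding writes is an assumed value, not something measured on this system.

```python
def sync_write_throughput_mib_s(block_kib, latency_ms, outstanding=1):
    """Throughput when each writer waits for its sync write to complete.

    Each outstanding writer finishes 1000 / latency_ms writes per second,
    so total throughput is concurrency * block size / latency.
    """
    writes_per_second = outstanding * 1000.0 / latency_ms
    return writes_per_second * block_kib / 1024.0


if __name__ == "__main__":
    # One writer at the two latency extremes quoted in the thread,
    # then an assumed 8 concurrent writers at the better latency.
    for latency_ms, outstanding in ((10, 1), (40, 1), (10, 8)):
        rate = sync_write_throughput_mib_s(8, latency_ms, outstanding)
        print(f"{latency_ms} ms, {outstanding} outstanding: {rate:.2f} MiB/s")
```

At these latencies a single stream of 8K sync writes stays below 1 MiB/s, and it takes several concurrent writers to reach even a few MiB/s, regardless of how much bandwidth the FC link or the disks could deliver.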
>>
>> does that mean, in other words, that oracle write IO will be about 7MB/s
>> if the zpool is made out of only jbods? I am
>> assuming the disk spec is 146GB 15K rpm
>
> How is the pool created?  Send the output of "zpool status poolname".
> I can't tell definitively from the iostat, but it appears that you have quite
> a bit of read/modify/write activity.

bash-3.00# zpool status mypool
  pool: mypool
 state: ONLINE
 scrub: none requested
config:

        NAME                                       STATE     READ WRITE CKSUM
        mypool                                     ONLINE       0     0     0
          raidz2                                   ONLINE       0     0     0
            c4t600C0FF0000000000A77B02A06F84B00d0  ONLINE       0     0     0
            c4t600C0FF0000000000A77B02E7F2C8C00d0  ONLINE       0     0     0
            c4t600C0FF0000000000A77B05D232D4E00d0  ONLINE       0     0     0
            c4t600C0FF0000000000A77B07E236A7A00d0  ONLINE       0     0     0
            c4t600C0FF0000000000A77B07E6593C400d0  ONLINE       0     0     0
            c4t600C0FF0000000000A77B016E1C3A800d0  ONLINE       0     0     0

errors: No known data errors
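
For readers who want the layout spelled out, the following Python sketch parses the config section shown above and reports how many of the LUNs carry data versus parity. It assumes a pool with a single raidz vdev, as here; the function name raidz_layout is hypothetical and the parsing is deliberately simplistic.

```python
CONFIG = """\
        NAME                                       STATE     READ WRITE CKSUM
        mypool                                  ONLINE       0     0     0
          raidz2                                   ONLINE       0     0     0
            c4t600C0FF0000000000A77B02A06F84B00d0  ONLINE       0     0     0
            c4t600C0FF0000000000A77B02E7F2C8C00d0  ONLINE       0     0     0
            c4t600C0FF0000000000A77B05D232D4E00d0  ONLINE       0     0     0
            c4t600C0FF0000000000A77B07E236A7A00d0  ONLINE       0     0     0
            c4t600C0FF0000000000A77B07E6593C400d0  ONLINE       0     0     0
            c4t600C0FF0000000000A77B016E1C3A800d0  ONLINE       0     0     0
"""


def raidz_layout(config_text):
    """Count the devices in a single raidz vdev and split data from parity."""
    parity = None
    devices = []
    for line in config_text.splitlines():
        fields = line.split()
        if len(fields) != 5 or fields[0] == "NAME":
            continue  # skip the header and anything that is not a table row
        name = fields[0]
        if name.startswith("raidz"):
            digit = name[5:6]  # "raidz" means single parity, "raidz2" double
            parity = int(digit) if digit.isdigit() else 1
        elif parity is not None:
            devices.append(name)  # rows after the raidz line are its members
    return {
        "parity": parity,
        "devices": len(devices),
        "data_devices": len(devices) - parity,
    }


if __name__ == "__main__":
    print(raidz_layout(CONFIG))
```

For this pool the sketch reports six devices in one raidz2 group, that is, four devices' worth of data and two of parity.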


>
> You will not likely be bandwidth limited for Oracle.  You are very likely
> to be latency limited.  Until you get better latency, you won't see better
> application performance.

ok. like I mentioned in another thread, it would be nice if there were a way
to tell oracle not to sync write to disk but just to the zpool. but that will
probably make oracle angry

>  -- richard
>
>>
>>>  -- richard
>>>
>>>>
>>>> here is some IO data which makes the t2000/3510 setup look even worse
>>>>
>>>> http://pastebin.com/QeAKDbfj
>>>>
>>>>
>>>>>
>>>>> --
>>>>> Asif Iqbal
>>>>> PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu
>>>>> A: Because it messes up the order in which people normally read text.
>>>>> Q: Why is top-posting such a bad thing?
>>>> _______________________________________________
>>>> zfs-discuss mailing list
>>>> zfs-discuss@opensolaris.org
>>>> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>>>
>>> --
>>> ZFS and NexentaStor training, Rotterdam, July 13-15, 2010
>>> http://nexenta-rotterdam.eventbrite.com/



-- 
Asif Iqbal
PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
