After I changed the recordsize to 8K, the read/write sizes reported by
zpool iostat are not always 8K. So ZFS doesn't obey the recordsize
strictly?

UC4-zuc4arch$> zfs get recordsize
NAME                            PROPERTY    VALUE  SOURCE
phximddb03data/zuc4arch/data01  recordsize  8K     local
phximddb03data/zuc4arch/data02  recordsize  8K     local

UC4-zuc4arch$>  zpool iostat  phximddb03data 1
                   capacity     operations    bandwidth
pool             used  avail   read  write   read  write
--------------  -----  -----  -----  -----  -----  -----
phximddb03data   487G   903G     13     62  1.26M  2.98M
phximddb03data   487G   903G    518      1  4.05M  23.8K    ===> here a write is of size 24K
phximddb03data   487G   903G    456     37  3.58M   111K
phximddb03data   487G   903G    551      0  4.34M  11.9K
phximddb03data   487G   903G    496      8  3.86M   239K
phximddb03data   487G   903G    472    229  3.68M   982K
phximddb03data   487G   903G    499      3  3.91M  3.96K
phximddb03data   487G   903G    525    138  4.12M   631K
phximddb03data   487G   903G    497      0  3.89M      0
phximddb03data   487G   903G    562      0  4.38M      0
phximddb03data   487G   903G    337      3  2.63M  47.5K
phximddb03data   487G   903G    140     35  4.55M  4.23M    ===> here a write is of size 128K
phximddb03data   487G   903G    484    272  7.12M  5.44M
phximddb03data   487G   903G    562      0  4.49M   127K
phximddb03data   487G   903G    514      4  4.03M   301K
phximddb03data   487G   903G    505     27  3.99M  1.00M
phximddb03data   487G   903G    518     14  4.10M   692K
phximddb03data   487G   903G    518      1  4.11M  14.4K
phximddb03data   487G   903G    504      2  3.98M   151K
phximddb03data   487G   903G    531      3  4.17M   392K
phximddb03data   487G   903G    375      2  2.95M   380K
phximddb03data   487G   903G    304      5  2.40M   296K
phximddb03data   487G   903G    438      3  3.45M   277K
phximddb03data   487G   903G    376      0  3.00M      0
phximddb03data   487G   903G    239     15  2.84M  1.98M
phximddb03data   487G   903G    221    857  4.51M  16.8M    ===> here a read is of size 20K



On Thu, Dec 25, 2008 at 12:25 PM, Neil Perrin <neil.per...@sun.com> wrote:

> The default recordsize is 128K. So you are correct: for random reads,
> performance will be bad, as excess data is read. For Oracle it is
> recommended to set the recordsize to 8K. This can be done when creating
> the filesystem with 'zfs create -o recordsize=8k <fs>'. If the fs has
> already been created, you can use 'zfs set recordsize=8k <fs>';
> *however*, this only takes effect for new files, so existing databases
> will retain the old block size.
>
> Hope this helps:
>
> Neil.
>
>
> qihua wu wrote:
>
>> Hi, All,
>>
>> We have an Oracle standby running on ZFS, and the database recovers
>> very slowly. The problem is that the IO performance is very bad. I
>> find that the recordsize of the ZFS filesystems is 128K, while the
>> Oracle block size is 8K.
>>
>> My question is:
>> When Oracle tries to write an 8K block, will ZFS read in 128K and then
>> write 128K? If that's the case, ZFS does 16 times (128K/8K = 16) as
>> much IO as necessary.
>>
>>                    extended device statistics
>>    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
>>    0.0    0.2    0.0    1.6  0.0  0.0    6.0    7.7   0   0 md4
>>    0.0    0.2    0.0    1.6  0.0  0.0    0.0    7.4   0   0 md14
>>    0.0    0.2    0.0    1.6  0.0  0.0    0.0    7.6   0   0 md24
>>    0.0    0.4    0.0    1.7  0.0  0.0    0.0    6.7   0   0 sd0
>>    0.0    0.4    0.0    1.7  0.0  0.0    0.0    6.5   0   0 sd2
>>    0.0    1.4    0.0  105.2  0.0  4.9    0.0 3503.3   0 100 ssd97
>>    0.0    3.0    0.0  384.0  0.0 10.0    0.0 3332.9   0 100 ssd99
>>    0.0    2.6    0.0  332.8  0.0 10.0    0.0 3845.7   0 100 ssd101
>>    0.0    4.4    0.0  563.3  0.0 10.0    0.0 2272.4   0 100 ssd103
>>    0.0    3.4    0.0  435.2  0.0 10.0    0.0 2940.8   0 100 ssd105
>>    0.0    3.6    0.0  460.8  0.0 10.0    0.0 2777.4   0 100 ssd107
>>    0.0    0.2    0.0   25.6  0.0  0.0    0.0   72.8   0   1 ssd112
>>
>> UC4-zuc4arch$> zfs list -o recordsize
>> RECSIZE
>>   128K
>>   128K
>>   128K
>>   128K
>>   128K
>>   128K
>>   128K
>>   128K
>>   128K
>>
>> Thanks,
>> Daniel,
>>
>
>
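
For reference, Neil's suggestions collected into a minimal sketch; the
filesystem name is taken from the thread, while the copy step and the
datafile names are only illustrative:

    # Set an 8K recordsize at creation time, so every file gets it
    # from the start:
    zfs create -o recordsize=8k phximddb03data/zuc4arch/data01

    # Or change an existing filesystem; this affects only files
    # written from now on:
    zfs set recordsize=8k phximddb03data/zuc4arch/data01

    # Existing datafiles keep their old 128K blocks until rewritten,
    # e.g. by copying them with the database shut down (illustrative
    # names, not from the thread):
    cp data01.dbf data01.dbf.tmp && mv data01.dbf.tmp data01.dbf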