Hello Bob, Wednesday, March 19, 2008, 11:23:58 PM, you wrote:
BF> On Wed, 19 Mar 2008, Bill Moloney wrote:
>> When application IO sizes get small, the overhead in ZFS goes
>> up dramatically.

BF> Thanks for the feedback. However, from what I have observed, it is
BF> not the full story at all. On my own system, when a new file is
BF> written, the write block size does not make a significant difference
BF> to the write speed. Similarly, read block size does not make a
BF> significant difference to the sequential read speed. I do see a
BF> large difference in rates when an existing file is updated
BF> sequentially, and a many-orders-of-magnitude difference for
BF> random I/O style updates.

BF> I think that there are some rather obvious reasons for the difference
BF> between writing a new file and updating an existing file. When
BF> writing a new file, the system can buffer up to a disk block's worth
BF> of data prior to issuing a disk I/O, or it can immediately write what
BF> it has; since the write is sequential, it does not need to re-read
BF> prior to writing (though there may be more metadata I/Os). When
BF> updating part of a disk block, there needs to be a read prior to the
BF> write if the block is not already cached in RAM.

Possibly when you created the file, ZFS used 128KB blocks. If you then
randomly update that file, the question is: what is the average update
size? If it is below 128KB (and not aligned), ZFS basically has to read
the old 128KB block first and then write the whole block out to a new
location.

In such a scenario (e.g. Oracle databases on ZFS), set the recordsize
property to something smaller BEFORE you create the files, ideally
matching your average update size. In the case of Oracle, matching
db_block_size should give you the best results most of the time (see the
short command sketch at the end of this message).

--
Best regards,
 Robert Milkowski
 mailto:[EMAIL PROTECTED]
 http://milek.blogspot.com
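A minimal sketch of the recordsize tuning described above. The dataset
name tank/oradata and the 8K value are assumptions for illustration
(8 KB is a common Oracle db_block_size); the point is that updating,
say, 8 KB inside a 128 KB record forces ZFS to read and rewrite the
full 128 KB record (roughly 16x write amplification plus a read), so
the property has to be in place before the data files are created:

    # Create the dataset with an 8 KB recordsize before laying down
    # the Oracle data files; files keep the block size they were
    # originally written with, so setting it later does not help them.
    zfs create -o recordsize=8K tank/oradata

    # Or, on an existing but still-empty dataset:
    zfs set recordsize=8K tank/oradata

    # Verify the property:
    zfs get recordsize tank/oradata

Files written after the change pick up the new record size; anything
already on disk would have to be copied or re-created to benefit.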