Chris Csanady wrote:
I believe I have tracked down the problem discussed in the "low
disk performance" thread. It seems that an alignment issue will
cause small file/block performance to be abysmal on a RAID-Z.
metaslab_ff_alloc() seems to naturally align all allocations, and
so all blocks will be aligned to asize on a RAID-Z. At certain
block sizes which do not produce full width writes, contiguous
writes will leave holes of dead space in the RAID-Z.
What I have observed with the iosnoop dtrace script is that the
first disks aggregate the single block writes, while the last disk(s)
are forced to do numerous writes every other sector. If you would
like to reproduce this, simply copy a large file to a recordsize=4k
filesystem on a 4 disk RAID-Z.
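
To make the "not full width" case above concrete, here is a rough
sketch of the sector arithmetic for one block. It assumes 512-byte
sectors, single parity, and one parity sector per row of data
sectors. The function name raidz_sectors and its parameters are
made up for illustration and are not taken from the ZFS source, and
the sketch does not model the metaslab_ff_alloc() alignment itself.

```python
def raidz_sectors(block_bytes, disks, parity=1, sector_bytes=512):
    """Return (data_sectors, parity_sectors, rows, last_row_data).

    Assumption (not from the ZFS source): data sectors are laid out
    in rows across the non-parity disks, with `parity` parity
    sectors added per row.
    """
    data_disks = disks - parity
    # Ceiling division: sectors needed to hold the block.
    data_sectors = -(-block_bytes // sector_bytes)
    # Ceiling division: rows needed to hold those data sectors.
    rows = -(-data_sectors // data_disks)
    parity_sectors = rows * parity
    # Data sectors in the final row; fewer than data_disks means
    # the last row is not a full-width write.
    last_row_data = data_sectors - (rows - 1) * data_disks
    return data_sectors, parity_sectors, rows, last_row_data


if __name__ == "__main__":
    data, par, rows, last = raidz_sectors(4096, disks=4)
    print("data sectors:", data)
    print("parity sectors:", par)
    print("rows:", rows)
    print("data sectors in last row:", last)
```

Under these assumptions a 4k block on a 4 disk RAID-Z needs 8 data
sectors spread over 3 data disks, so the third row carries only 2
data sectors and the last disk is skipped in that row.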
Why would I want to set recordsize=4k if I'm using large files?
For that matter, why would I ever want to use recordsize=4k? Is
there a database which needs 4k record sizes?
It would probably fix the problem if this dead space was explicitly
zeroed to allow the writes to be aggregated, but that would be
an egregious hack. If the alignment constraints could be relaxed
though, that should improve the parity distribution, as well as get
rid of the dead space and associated problem.
This is one of those things I wanted to look at in my copious spare
time. Has anyone else done similar analysis?
-- richard