Edward Ned Harvey wrote:
> From: Robert Milkowski [mailto:mi...@task.gda.pl]
>
>> [In raidz] The issue is that each zfs filesystem block is basically
>> spread across n-1 devices. So every time you want to read back a
>> single fs block, you need to wait for all n-1 devices to provide you
>> with their part of it, and keep in mind that in zfs you can't get a
>> partial block even if that's all you asked for, because zfs has to
>> verify the checksum of the entire fs block.
>
> Can anyone else confirm or deny the correctness of this statement?
> If you read a small file from a raidz volume, do you have to wait for
> every single disk to return a small chunk of the block? I know this is
> true for large files which require more than one block, obviously, but
> does even a small file get spread out across multiple disks?
>
> This may be the way it's currently implemented, but it's not a
> mathematical requirement. It is possible, if desired, to implement
> raid parity and still allow small files to be written entirely on a
> single disk, without losing redundancy. That would provide the
> redundancy and the large-file performance (both of which are already
> present in raidz), while also optimizing small-file random operations,
> which may not be optimized in raidz today.
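The "wait for the slowest disk" effect described above can be sketched numerically. This is my own toy model, not anything from the thread or from ZFS itself: assume each disk takes a uniformly random 0-10 ms to position and return its chunk, so a striped block read completes only when the slowest participating disk answers.

```python
import random

def avg_stripe_read_latency(n_disks, trials=100_000, seed=1):
    # Toy model: each disk needs a uniform 0-10 ms to seek and return
    # its chunk of the block; the read completes only when the slowest
    # of the n participating disks has answered.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += max(rng.uniform(0.0, 10.0) for _ in range(n_disks))
    return total / trials

# A lone disk averages ~5 ms per read; waiting on 7 disks at once
# averages ~8.75 ms (expected max of 7 uniform draws is 10 * 7/8).
print(f"{avg_stripe_read_latency(1):.2f} ms")
print(f"{avg_stripe_read_latency(7):.2f} ms")
```

The latency penalty per read is modest, but the bigger cost is throughput: if every block read occupies all data disks, the vdev's random-read IOPS is roughly that of a single disk, which is the usual argument against wide raidz for random workloads.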
As I understand it, that's the whole point of raidz: each block is its
own stripe. If necessary, the block gets broken down into 512-byte
chunks to spread it as wide as possible, and each block gets its own
parity added. So if the array is too wide for the block to be spread
across all disks, you also lose space, because the stripe is not full
and parity is still added to that small stripe. That means that if you
only write 512-byte blocks, each write puts three sectors on disk
(e.g. on raidz2: one data sector plus two parity sectors), so the net
capacity drops to one third, regardless of how many disks are in your
raid group.
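The space arithmetic above can be worked through with a short sketch. This is a simplified model of my own, not the exact on-disk allocator: data is split into sector-sized chunks across the data columns, each stripe row gets one parity sector per parity level, and the allocation is rounded up to a multiple of (parity + 1) so leftover free space stays allocatable.

```python
import math

def raidz_alloc_sectors(block_bytes, ndisks, nparity, sector=512):
    """Rough model of raidz space use for one filesystem block.
    Simplified sketch, not the exact ZFS allocator."""
    data = math.ceil(block_bytes / sector)        # data sectors needed
    rows = math.ceil(data / (ndisks - nparity))   # stripe rows used
    total = data + rows * nparity                 # parity added per row
    rem = total % (nparity + 1)                   # pad to multiple of p+1
    if rem:
        total += (nparity + 1) - rem
    return total

# A single 512-byte block on an 8-disk raidz2 costs 3 sectors
# (1 data + 2 parity), i.e. one third net capacity:
print(raidz_alloc_sectors(512, 8, 2))        # 3
# The same block on a 5-disk raidz1 costs 2 sectors (1 data + 1 parity):
print(raidz_alloc_sectors(512, 5, 1))        # 2
# A full 128 KiB block on the 8-disk raidz2 amortizes parity much better:
print(raidz_alloc_sectors(128 * 1024, 8, 2)) # 342 sectors for 256 of data
```

This makes the quoted point concrete: the fixed per-block parity is what makes tiny blocks so expensive, while large blocks amortize it across many data sectors.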
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss