Which gap?
'RAID-Z should mind the gap on writes'?
Message was edited by: thometal
I believe this is in reference to the RAID-5 write hole, described
here:
http://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_5_performance
It's not.
So I'm not sure what the 'RAID-Z should mind the gap on writes'
comment is getting at either.
Clarification?
I'm planning to write a blog post describing this, but the basic
problem is that RAID-Z, by virtue of supporting variable stripe writes
(the insight that allows us to avoid the RAID-5 write hole), must
round the number of sectors up to a multiple of nparity+1. This means
that we may have sectors that are effectively skipped. ZFS generally
lays down data in large contiguous streams, but these skipped sectors
can stymie both ZFS's write aggregation and the hard drive's
ability to group I/Os and write them quickly.
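
To make the rounding concrete, here is a rough Python sketch of the arithmetic. It is an illustrative model only, not the actual ZFS code: the function name is made up, and it assumes each row of data sectors carries nparity parity sectors before the total is rounded up to a multiple of nparity+1.

```python
def raidz_allocated_sectors(data_sectors, ndisks, nparity):
    """Return (allocated, skipped) sector counts for one block.

    Illustrative model only; the name and the per-row parity
    accounting are assumptions, not taken from the ZFS source.
    """
    data_cols = ndisks - nparity
    # Assume one set of parity sectors per row of data sectors;
    # the last row may be only partially filled.
    rows = -(-data_sectors // data_cols)  # ceiling division
    used = data_sectors + nparity * rows
    # Round the total up to a multiple of nparity + 1.
    unit = nparity + 1
    allocated = -(-used // unit) * unit
    # The difference is the number of sectors that get skipped.
    return allocated, allocated - used
```

Under these assumptions, a 2-sector block on a 5-disk single-parity vdev uses 3 sectors (2 data + 1 parity), which rounds up to 4 and leaves 1 skipped sector; an 8-sector block uses 10 and skips none.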
Jeff Bonwick added some code to mind these gaps on reads. The key
insight there is that if we're going to read 64K, say, with a 512-byte
hole in the middle, we might as well do one big read rather than two
smaller reads and just throw out the data that we don't care about.
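
The read-side idea can be sketched roughly as follows. This is a simplified Python model, not the code in ZFS; the function name and the max_gap parameter are hypothetical.

```python
def aggregate_reads(extents, max_gap):
    """Merge (offset, size) reads separated by at most max_gap bytes.

    Hypothetical helper: the bytes read from inside a bridged gap
    are simply discarded by the caller.
    """
    merged = []
    for offset, size in sorted(extents):
        if merged:
            last_off, last_size = merged[-1]
            last_end = last_off + last_size
            if offset - last_end <= max_gap:
                # Close enough: extend the previous read across the gap.
                new_end = max(last_end, offset + size)
                merged[-1] = (last_off, new_end - last_off)
                continue
        merged.append((offset, size))
    return merged
```

For example, reads of (0, 32768) and (33280, 32256) with max_gap=512 collapse into a single 64K read at offset 0; with max_gap=0 they stay as two separate reads.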
Of course, doing this for writes is a bit trickier since we can't just
blithely write over gaps as those might contain live data on the disk.
To solve this we push the knowledge of those skipped sectors down to
the I/O aggregation layer in the form of 'optional' I/Os purely for
the purpose of coalescing writes into larger chunks.
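
As a rough illustration of how 'optional' I/Os might help, here is a simplified Python model. It is not the ZFS aggregation code; the function name and the (offset, size, optional) tuple layout are made up for this sketch. An optional I/O stands for a skipped sector that is known to be safe to write, and it is only issued when it bridges two real writes.

```python
def aggregate_writes(ios):
    """Coalesce writes given as (offset, size, optional) tuples.

    Hypothetical model: optional I/Os cover skipped sectors and are
    kept only when they join real writes into one larger write.
    Returns a list of (offset, size) writes to issue.
    """
    runs = []
    current = []  # contiguous I/Os collected so far

    def flush():
        # Optional I/Os at either end bridge nothing, so drop them.
        while current and current[-1][2]:
            current.pop()
        while current and current[0][2]:
            current.pop(0)
        if current:
            start = current[0][0]
            end = current[-1][0] + current[-1][1]
            runs.append((start, end - start))
        current.clear()

    for io in sorted(ios):
        # A true gap (covered by no I/O at all) ends the current run,
        # since it might contain live data.
        if current and io[0] != current[-1][0] + current[-1][1]:
            flush()
        current.append(io)
    flush()
    return runs
```

With two 4K writes at offsets 0 and 4608 plus an optional 512-byte I/O at 4096, the model issues one write of 8704 bytes. Without the optional I/O the gap is treated as possibly live, and the two writes stay separate.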
I hope that's clear; if it's not, stay tuned for the aforementioned
blog post.
Adam
--
Adam Leventhal, Fishworks http://blogs.sun.com/ahl
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss