On Sun, Jun 22, 2008 at 15:37, Bob Friesenhahn <[EMAIL PROTECTED]> wrote:
> Keep in mind that ZFS checksums all data, the checksum is stored in a
> different block than the data, and that if ZFS were to checksum on the
> stripe segment level, a lot more checksums would need to be stored.
> All these extra checksums would require more data access, more

I think the question is more "why segment in the first place?". If ZFS kept everything in recordsize blocks that each reside whole on a single disk (or on two disks, if mirroring is in play) and made the parity just another recordsize block, one could avoid the penalty of seeking every disk on every read.
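To make the idea concrete, here is a tiny sketch (illustrative only, not ZFS internals) of whole-record blocks with parity as just another same-sized block, XORed from its peers. Reading one record touches only one disk, yet any single lost block can still be rebuilt from the survivors plus parity:

```python
def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

RECORDSIZE = 8  # tiny for illustration; the ZFS default recordsize is 128K

# Three data blocks, one per disk; the parity block lives on a fourth disk.
d0 = b"AAAAAAAA"
d1 = b"BBBBBBBB"
d2 = b"CCCCCCCC"
parity = xor_blocks([d0, d1, d2])

# A read of d1 needs only the disk holding d1 -- no full-stripe read.
# If the disk holding d0 dies, d0 comes back from the survivors + parity:
assert xor_blocks([d1, d2, parity]) == d0
```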
The downside of this scheme would be deletes: if you actually free a block, its parity becomes useless. So you'd need to do something like keep the old, now-useless block around, put its neighbors in the parity group on a list of blocks to be re-paritied, and only actually free the block once the new parity has been regenerated.

An advantage this would have is changing the width of raidz/raidz2 groups: if another disk is added, one can mark every block as needing new parity of width N+1 and let the re-parity process do its thing. This would take a while, of course, but it would add the expandability that people have been asking for.

> Perhaps the solution is to install more RAM in the system so that the
> stripe is fully cached and ZFS does not need to go back to disk prior
> to writing an update.

I don't think the problem is that the stripe is falling out of cache, but that it costs so much to get it into memory in the first place.

Will
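The deferred-free bookkeeping described above could look roughly like this (all names here are hypothetical; this is a sketch of the proposal, not of anything ZFS actually does). A freed block stays allocated, parked on a re-parity list, until a background pass has regenerated parity over the surviving blocks:

```python
def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

class ParityGroup:
    """Hypothetical parity group: N data blocks plus one XOR parity block."""

    def __init__(self, blocks):
        self.blocks = list(blocks)        # live data blocks
        self.parity = xor_blocks(blocks)  # current parity block
        self.pending_free = []            # freed blocks awaiting re-parity

    def free(self, block):
        # Can't reclaim the space yet: the old contents still back the
        # parity protecting the surviving neighbors.
        self.blocks.remove(block)
        self.pending_free.append(block)

    def reparity(self):
        # Background pass: regenerate parity over the survivors; only
        # then is it safe to really free the deferred blocks.
        self.parity = xor_blocks(self.blocks)
        freed, self.pending_free = self.pending_free, []
        return freed

g = ParityGroup([b"AAAA", b"BBBB", b"CCCC"])
g.free(b"BBBB")
assert g.pending_free == [b"BBBB"]   # space not yet reclaimed
freed = g.reparity()
assert freed == [b"BBBB"]            # now reclaimable
assert g.parity == xor_blocks([b"AAAA", b"CCCC"])
```

The same `reparity()` pass is what a width change would lean on: mark every group dirty, rebuild parity at the new width, free as you go.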