On 8/10/07, Moore, Joe <[EMAIL PROTECTED]> wrote:
> Wishlist: It would be nice to put the whole redundancy definitions into
> the zfs filesystem layer (rather than the pool layer): Imagine being
> able to "set copies=5+2" for a filesystem... (requires a 7-VDEV pool,
> and stripes via RAIDz2, otherwise the zfs create/set fails)
Yes please ;) This is practically the holy grail of "dynamic raid": the ability to use different redundancy settings dynamically at a per-directory level, and to use a mix of different-sized devices and add/remove them at will. I guess one would call this feature "ditto-block setting of stripe+parity". It's doable, but it probably requires large(ish) changes to the on-disk structures, as the block pointer will look different. James, did you look at this?

With vdev removal in place (which I suppose will be implemented with some kind of "rewrite block" -type code), "reshape" and rebalance functionality would probably be relatively small additional improvements.

BTW, here are more wishlist items now that we're at it:

- copies=max+2 (use as many stripes as possible, with the border case of a 3-way mirror)
- minchunk=8kb (don't spread stripes smaller than this - a performance optimization)
- checksum on every disk independently (instead of the full stripe) - fixes raidz random read performance

And one crazy idea just popped into my head: fs-level raid could be implemented with separate parity blocks instead of the ditto mechanism. Say, when data is first written, a normal ditto block is used. Then later, asynchronously, the block is combined with some other blocks (which may be unrelated), the parity is written to a new allocation, and the ditto block(s) are freed. When data blocks are freed (by COW), the parity needs to be recalculated before the data block can actually be forgotten. This can be thought of as combining a number of ditto blocks into a parity block. That may be easier or more complicated to implement than saving the block as stripe+parity in the first place - it depends on the data structures, which I don't yet know intimately.
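To make the per-disk-checksum wishlist item concrete: with one checksum covering the whole logical block, even a small random read has to pull that block's piece off every data disk before the checksum can be verified, while an independent per-disk checksum would let a single piece be verified on its own. A toy read-count comparison (the function names and the disk count are made up for illustration):

```python
# Toy read-amplification model for the per-disk checksum idea.
# Assumes a block striped across `ndisks` data disks; purely illustrative.

def disks_read_full_stripe_checksum(ndisks):
    # One checksum covers the reassembled block, so every data disk's
    # piece must be read before verification is possible.
    return ndisks

def disks_read_per_disk_checksum(ndisks):
    # Each piece carries its own checksum, so a small random read that
    # touches one piece can be verified with one disk read.
    return 1

ndisks = 6
print(disks_read_full_stripe_checksum(ndisks))  # 6
print(disks_read_per_disk_checksum(ndisks))     # 1
```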
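The "combine ditto blocks into a parity block" idea above can be sketched in a few lines. This is only a toy model of the bookkeeping - the class and names are invented here, and real ZFS block pointers and allocation are far more involved - but it shows the two invariants: parity is updated (XORed out) before a freed block is forgotten, and any single lost member stays reconstructable:

```python
# Toy model: N (possibly unrelated) data blocks protected by one XOR
# parity block, replacing their individual ditto copies.

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

class ParityGroup:
    """One parity block covering several same-sized data blocks."""

    def __init__(self, blocks):
        # blocks: dict name -> bytes; parity = XOR of all members.
        self.blocks = dict(blocks)
        self.parity = b"\x00" * len(next(iter(blocks.values())))
        for data in blocks.values():
            self.parity = xor(self.parity, data)
        # Once parity exists, the ditto copies of these blocks could be freed.

    def free(self, name):
        # A block freed by COW is XORed out of the parity *before* its
        # on-disk copy is actually forgotten.
        data = self.blocks.pop(name)
        self.parity = xor(self.parity, data)

    def reconstruct(self, lost):
        # Rebuild one lost member from parity + the surviving members.
        out = self.parity
        for name, data in self.blocks.items():
            if name != lost:
                out = xor(out, data)
        return out

grp = ParityGroup({"a": b"AAAA", "b": b"BBBB", "c": b"CCCC"})
assert grp.reconstruct("b") == b"BBBB"   # single-block loss is recoverable
grp.free("c")                            # parity updated before 'c' is dropped
assert grp.reconstruct("a") == b"AAAA"
```

Note the asymmetry the post points out: a write only touches the ditto copy at first, but a free forces a synchronous parity update, which is where this scheme could get more complicated than plain stripe+parity.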
Come to think of it, it's probably best to get all these ideas out there _before_ I start looking into the code - knowing the details has a tendency to kill all the crazy ideas :)

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss