[zfs-discuss] ZFS Questions. (RAID-Z questions actually)

Steven Sim Mon, 03 Jul 2006 08:17:09 -0700

Hello Gurus;

I've been playing with ZFS and reading the materials, blogs and FAQs.

It's an awesome FS and I just wish that Sun would evangelize a little bit more. But that's another story.


I'm writing here to ask a few very simple questions.

I am able to understand the RAID-5 write hole and its implications.

I am, however, not able to grasp the concept of RAID-Z. More specifically, the following statements, which are repeated over and over again across many blogs, FAQs and reading materials...

From Jeff Bonwick's weblog (http://blogs.sun.com/roller/page/bonwick/20051118)

"RAID-Z is a data/parity scheme like RAID-5, but it uses dynamic stripe width. Every block is its own RAID-Z stripe, regardless of blocksize. This means that every RAID-Z write is a full-stripe write. This, when combined with the copy-on-write transactional semantics of ZFS, completely eliminates the RAID write hole. RAID-Z is also faster than traditional RAID because it never has to do read-modify-write."
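
To make the "every block is its own stripe" idea concrete, here is a minimal sketch, assuming single XOR parity and a toy 4-byte sector. The names make_stripe and rebuild are hypothetical, and the column layout is a simplification for illustration, not the actual ZFS on-disk format. It shows that the parity column is computed per block, that a small block yields a narrower stripe, and that any one lost column can be recomputed from the survivors.

```python
from functools import reduce

SECTOR = 4  # toy sector size in bytes (assumption; real sectors are 512 B or larger)

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def make_stripe(block: bytes, ndisks: int) -> list:
    """Turn ONE block into its own stripe: [parity, data0, data1, ...].

    The stripe uses at most ndisks - 1 data columns, and fewer when the
    block is small, so the stripe width varies from block to block.
    """
    if not block:
        raise ValueError("block must not be empty")
    nsectors = -(-len(block) // SECTOR)      # ceiling division
    width = min(nsectors, ndisks - 1)        # data columns actually used
    col_len = -(-nsectors // width) * SECTOR # bytes per column
    padded = block.ljust(width * col_len, b"\0")
    data = [padded[i * col_len:(i + 1) * col_len] for i in range(width)]
    parity = reduce(xor_bytes, data)         # parity covers this block only
    return [parity] + data

def rebuild(columns: list, lost: int) -> bytes:
    """Recompute the column at index `lost` by XOR-ing all surviving columns."""
    survivors = [c for i, c in enumerate(columns) if i != lost]
    return reduce(xor_bytes, survivors)
```

In this sketch the whole stripe (parity plus data) is produced in one step from the block itself, so nothing already on disk has to be read and modified. With 3 disks, an 8-byte block gives a 3-column stripe while a 3-byte block gives a 2-column stripe.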

I understand the copy-on-write thing. That was very well illustrated in "ZFS The Last Word in File Systems" by Jeff Bonwick.

But if every block is its own RAID-Z stripe, then if the block is lost, how does ZFS recover the block?

Is the stripe parity (as opposed to the block checksum, which I understand) stored somewhere else or within the same block?

And how exactly does "every RAID-Z write is a full-stripe write" work? More specifically, in a 3-disk RAID-Z configuration, if one disk fails completely and is replaced, exactly how does the "metadata-driven reconstruction" recover the newly replaced disk?


It goes on... (and there are very similar statements on other sites and in other materials)

"....Well, the tricky bit here is RAID-Z reconstruction. Because the stripes are all different sizes, there's no simple formula like "all the disks XOR to zero." You have to traverse the filesystem metadata to determine the RAID-Z geometry. Note that this would be impossible if the filesystem and the RAID array were separate products, which is why there's nothing like RAID-Z in the storage market today. You really need an integrated view of the logical and physical structure of the data to pull it off."
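
One way to picture "traverse the filesystem metadata to determine the RAID-Z geometry" is the sketch below. It is a toy model under loud assumptions: block_map stands in for the real block pointers, each disk is a dict from offset to sector, parity is single XOR, and write_stripe and resilver are hypothetical names. Because the stripes have different widths and start on different disks, the rebuild cannot simply XOR the same offset across all disks; it has to walk the per-block map to learn which columns belong together.

```python
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

# Hypothetical "metadata": for each live block, the (disk, offset) of every
# column in its stripe, parity column first. Widths and start disks differ.
block_map = {
    "A": [(0, 0), (1, 0), (2, 0)],  # parity + 2 data columns
    "B": [(0, 1), (1, 1)],          # parity + 1 data column (small block)
    "C": [(2, 1), (0, 2), (1, 2)],  # stripe whose parity lands on disk 2
}

def write_stripe(disks: list, columns: list, data_cols: list) -> None:
    """Write parity plus data columns to the locations listed in `columns`."""
    parity = reduce(xor_bytes, data_cols)
    for (disk, offset), payload in zip(columns, [parity] + data_cols):
        disks[disk][offset] = payload

def resilver(disks: list, block_map: dict, failed: int) -> None:
    """Rebuild a replaced disk by walking the metadata, one stripe at a time."""
    for columns in block_map.values():
        lost = [c for c in columns if c[0] == failed]
        if not lost:
            continue  # this stripe never touched the failed disk
        survivors = [disks[d][o] for d, o in columns if d != failed]
        disks[failed][lost[0][1]] = reduce(xor_bytes, survivors)
```

A side effect visible in this model is that only offsets referenced by live blocks are rewritten; space that the metadata does not reference is never touched during the rebuild.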

Every stripe is a different size? Is this because ZFS adapts to the nature of the I/O coming to it?

Could someone elaborate more on the statement "metadata drives reconstruction"?

(I am familiar with metadata. More specifically, I am familiar with UFS and its methodology. But I am having a little difficulty with the above statement.)


The following is from zfs admin 0525:

"In RAID-Z, ZFS uses variable-width RAID stripes so that all writes are full-stripe writes. This design is only possible because ZFS integrates file system and device management in such a way that the file system's metadata has enough information about the underlying data replication model to handle variable-width RAID stripes."


I could use a little help here...

I apologize if these questions are elementary ones.

Warmest Regards
Steven Sim




Fujitsu Asia Pte. Ltd.

