Erik Trimble wrote:
Richard Elling wrote:
Erik Trimble wrote:
All this discussion hasn't answered one thing for me: exactly
_how_ does ZFS do resilvering, both in the case of mirrors and of
RAIDZ[2]?
I've seen some mention that it goes in chronological order of file
creation (which, to me, means the metadata must be read first), and
that only used blocks are rebuilt, but what exactly is the
methodology being used?
See Jeff Bonwick's blog on the topic
http://blogs.sun.com/bonwick/entry/smokin_mirrors
-- richard
That's very informative. Thanks, Richard.
So, ZFS walks the used block tree to see what still needs
rebuilding. I guess I have two related questions then:
(1) Are these blocks some fixed size (based on the media, usually 512
bytes), or are they "ZFS blocks" -- the variable size determined by
the original file being written?
They are metadata, so they are compressed. I would expect many of them
to be small -- though I have no data to back that assumption, it
wouldn't be hard to measure.
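To make the "walks the used block tree" idea concrete, here is a minimal sketch of birth-time-pruned resilvering along the lines Bonwick's post describes. This is not ZFS source: Block, birth_txg, and children are hypothetical stand-ins for the blkptr_t fields, and repair() stands in for the actual reconstruction of a block onto the new device. The key property it illustrates is that a parent block's birth transaction group is at least as new as its children's, so whole unchanged subtrees can be skipped.

```python
# Illustrative sketch (NOT ZFS source code): resilver by walking only
# the allocated block tree, skipping subtrees unchanged since the
# device went offline. Names are hypothetical stand-ins.

class Block:
    def __init__(self, birth_txg, data, children=()):
        self.birth_txg = birth_txg   # transaction group that wrote this block
        self.data = data
        self.children = children     # child block pointers (metadata fans out)

def resilver(root, offline_txg, repair):
    """Depth-first walk; repair only blocks born after the outage."""
    if root.birth_txg <= offline_txg:
        # Parent birth txg >= any child's, so the whole subtree
        # predates the outage and needs no work.
        return
    repair(root)
    for child in root.children:
        resilver(child, offline_txg, repair)
```

For a full disk replacement the cutoff is effectively zero (every allocated block is repaired); the pruning pays off when a mirror side was only briefly detached.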
(2) Is there some reasonable way to read multiples of these blocks
in a single IOP? In theory, if the blocks are in chronological
creation order, they should be (relatively) sequential on the
drive(s), so ZFS should be able to read several of them without
forcing a random seek -- that is, you should be able to get multiple
blocks in a single IOP.
Metadata is prefetched. You can look at the hit rate in kstats.
Stuart, you might post the output of "kstat -n vdev_cache_stats"
I regularly see cache hit rates in the 60% range, which isn't bad
considering what is being cached.
-- richard
If we can't get multiple ZFS blocks in one sequential read, we're
screwed: ZFS is going to be IOPS-bound on the replacement disk, with
no real workaround, which means rebuild times for disks with lots of
small files are going to be hideous.
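A back-of-envelope calculation shows why the IOPS-bound case is so painful. This sketch just divides block count by random IOPS; the 50-million-block and 100-IOPS figures are made-up illustrative numbers, not measurements, and blocks_per_io models how much prefetch/batching helps.

```python
# Back-of-envelope sketch of the worry above: if resilver degenerates
# to one random I/O per small block, rebuild time scales with block
# count over random IOPS, not with disk bandwidth. Inputs below are
# illustrative, not measured.

def rebuild_hours(used_blocks, disk_iops, blocks_per_io=1):
    """Worst-case resilver time in hours for an IOPS-bound rebuild."""
    ios = used_blocks / blocks_per_io
    return ios / disk_iops / 3600

# 50 million small blocks at ~100 random IOPS:
#   one block per I/O      -> ~139 hours
#   8 blocks batched per I/O -> ~17 hours
```

So even modest batching of chronologically adjacent blocks changes the rebuild from days to hours, which is exactly why the sequential-layout question matters.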
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss